Research Article

Zero Initialised Unsupervised Active Learning by Optimally Balanced Entropy-Based Sampling for Imbalanced Problems

Pages 781-814 | Received 03 Nov 2020, Accepted 22 Apr 2021, Published online: 24 May 2021
 

ABSTRACT

Given the challenge of gathering labelled training data for machine learning tasks, active learning has become popular. This paper focuses on the very beginning of unsupervised active learning, where no labelled data are available at all. The aim of this zero initialised unsupervised active learning is to select the most informative examples, even from an imbalanced dataset, to be labelled manually. Our solution, built on the proposed selection strategy called Optimally Balanced Entropy-Based Sampling (OBEBS), reaches a balanced training set at each step to avoid the problems caused by imbalance. Two theorems on the optimal solution of the selection strategy are also presented and proved in the paper. At the beginning of active learning there is not enough information for a supervised machine learning method, so our selection strategy is based on unsupervised learning (clustering). The cluster membership likelihoods of the items are essential for the algorithm to connect the clusters and the classes, i.e., to find an assignment between them. The Hungarian algorithm is used to find the best assignment, and single, multi, and adaptive assignment variants of the OBEBS method are developed. Experimental results on generated and real image datasets of handwritten digits show that our method surpasses the state-of-the-art methods.
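The two ingredients named in the abstract, entropy over cluster membership likelihoods and a cluster-to-class assignment, can be illustrated with a small sketch. This is not the OBEBS procedure from the paper: the membership values, the `labels` dictionary, the way the `score` matrix is built, and the function names are all hypothetical, and the assignment is found by exhaustive search over permutations rather than by the Hungarian algorithm (both return an optimal one-to-one assignment, but the Hungarian algorithm does so in polynomial time).

```python
import math
from itertools import permutations


def entropy(p):
    """Shannon entropy (in nats) of one item's cluster membership likelihoods."""
    return -sum(q * math.log(q) for q in p if q > 0)


def best_assignment(score):
    """Return a tuple a where cluster k is assigned to class a[k].

    Exhaustive search over all one-to-one assignments, used here only as a
    stand-in for the Hungarian algorithm on a tiny toy problem.
    score[k][c] is a hypothetical affinity between cluster k and class c.
    """
    n = len(score)
    return max(
        permutations(range(n)),
        key=lambda perm: sum(score[k][perm[k]] for k in range(n)),
    )


# Toy cluster membership likelihoods: one row per item, one column per cluster.
memberships = [
    [0.90, 0.05, 0.05],  # item 0
    [0.10, 0.80, 0.10],  # item 1
    [0.05, 0.15, 0.80],  # item 2
    [0.40, 0.35, 0.25],  # item 3
    [0.34, 0.33, 0.33],  # item 4
]

# Hypothetical labels already obtained from the annotator: item index -> class.
labels = {0: 2, 1: 0, 2: 1}

# Assumed scoring rule: accumulate the memberships of labelled items per class.
n = 3
score = [[0.0] * n for _ in range(n)]
for item, cls in labels.items():
    for k in range(n):
        score[k][cls] += memberships[item][k]

assignment = best_assignment(score)

# Among the unlabelled items, query the one with the most uncertain membership.
unlabelled = [i for i in range(len(memberships)) if i not in labels]
query = max(unlabelled, key=lambda i: entropy(memberships[i]))
```

In this toy setting the item whose memberships are closest to uniform is the one selected for labelling; how OBEBS balances the selected set across classes is described in the full paper and is not reproduced here.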

Acknowledgments

The research was partly supported by the ÚNKP-19-3 New National Excellence Program of the Ministry of Human Capacities. The research has also been supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.2-16-2017-00013, Thematic Fundamental Research Collaborations Grounding Innovation in Informatics and Info-communications).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.
