Research Article

Zero Initialised Unsupervised Active Learning by Optimally Balanced Entropy-Based Sampling for Imbalanced Problems

Pages 781-814 | Received 03 Nov 2020, Accepted 22 Apr 2021, Published online: 24 May 2021

