A preprocessing method combined with an ensemble framework for the multiclass imbalanced data classification

Pages 1178-1185 | Received 13 Sep 2019, Accepted 29 Nov 2019, Published online: 10 Dec 2019
