Research Article

An ensemble machine learning approach for classification tasks using feature generation

Article: 2231168 | Received 20 Mar 2023, Accepted 23 Jun 2023, Published online: 11 Jul 2023


  • Abdelhamid, N., Ayesh, A., & Thabtah, F. A. (2014). Phishing detection based associative classification data mining. Expert Systems with Applications, 41(13), 5948–5959. https://doi.org/10.1016/j.eswa.2014.03.019
  • Bohanec, M. (1997). Car Evaluation Data Set. https://archive.ics.uci.edu/ml/datasets/Car+Evaluation
  • Breiman, L. (1996). Stacked regressions. Machine Learning, 24, 49–64. https://doi.org/10.1007/BF00117832
  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
  • Chatzimparmpas, A., Martins, R. M., Kucher, K., & Kerren, A. (2021). Empirical study: Visual analytics for comparing stacking to blending ensemble learning. In 2021 23rd International Conference on Control Systems and Computer Science (CSCS) (pp. 1–8). https://doi.org/10.1109/CSCS52396.2021.00008
  • Cheeseman, P., & Stutz, J. (1996). Bayesian classification (AutoClass): Theory and results. Advances in Knowledge Discovery & Data Mining, 180, 153–180. https://doi.org/10.5555/257938.257954
  • Chen, J., & Qian, Y. (2021). Hierarchical multi-label ship recognition in remote sensing images using label relation graphs. In 2021 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (pp. 4968–4971). https://doi.org/10.1109/IGARSS47720.2021.9554687
  • Chiong, R., Fan, Z., Hu, Z., & Dhakal, S. (2022). A novel ensemble learning approach for stock market prediction based on sentiment analysis and the sliding window method. IEEE Transactions on Computational Social Systems, 1–11. https://doi.org/10.1109/TCSS.2022.3182375
  • Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27. https://doi.org/10.1109/TIT.1967.1053964
  • Cramer, J. S. (2003). The origins of logistic regression (TI 2002-119/4). Social Science Electronic Publishing.
  • Das, B. B., Ram, S. K., Pati, B., Panigrahi, C. R., Babu, K. S., & Mohapatra, R. K. (2021). SVM and ensemble-SVM in EEG-based person identification. In C. R. Panigrahi, B. Pati, P. Mohapatra, R. Buyya, & K. C. Li (Eds.), Progress in advanced computing and intelligent engineering (pp. 137–146). Springer Singapore.
  • de Alencar Barreto, G., & da Rocha Neto, A. R. (2011). Vertebral Column Data Set. https://archive.ics.uci.edu/ml/datasets/Vertebral+Column
  • Dietterich, T. G. (1997). Machine learning research: Four current directions. AI Magazine, 18(4), 97–136. https://doi.org/10.1609/aimag.v18i4.1324
  • Dua, D., & Graff, C. (2017). Statlog (Heart) Data Set. https://archive.ics.uci.edu/ml/datasets/Statlog+Heart
  • Er, M. B., & Aydilek, I. B. (2019). Music emotion recognition by using chroma spectrogram and deep visual features. International Journal of Computational Intelligence Systems, 12(2), 1622–1634. https://doi.org/10.2991/ijcis.d.191216.001
  • Ertam, F. (2019). Internet Firewall Data Set. https://archive.ics.uci.edu/ml/datasets/Internet+Firewall+Data
  • Faiz, M. F. I. (2021). SVM-based ensemble classifiers to detect android malware. In L. Barolli, I. Woungang, & T. Enokido (Eds.), Advanced information networking and applications (pp. 346–354). Springer International Publishing.
  • Fan, Z., & Chiong, R. (2022). Identifying digital capabilities in university courses: An automated machine learning approach. Education and Information Technologies, 28(3), 1–16. https://doi.org/10.1007/s10639-022-11075-8
  • Fan, Z., & Gou, J. (2023). Predicting body fat using a novel fuzzy-weighted approach optimized by the whale optimisation algorithm. Expert Systems with Applications, 217, 119558. https://doi.org/10.1016/j.eswa.2023.119558
  • Fan, Z., Wu, F., & Tang, Y. (2023). A hierarchy-based machine learning model for happiness prediction. Applied Intelligence, 53(6), 7108–7117. https://doi.org/10.1007/s10489-022-03811-x
  • Fisher, R. (1988). Iris data set. https://archive.ics.uci.edu/ml/datasets/Iris
  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451
  • Gadde, S., Lakshmanarao, A., & Satyanarayana, S. (2021). SMS spam detection using machine learning and deep learning techniques. In 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) (Vol. 1, pp. 358–362). https://doi.org/10.1109/ICACCS51430.2021.9441783
  • Gaikwad, S., Patel, S., & Shetty, A. (2021). Brain tumor detection: An application based on machine learning. In 2021 2nd International Conference for Emerging Technology (INCET) (pp. 1–4). https://doi.org/10.1109/INCET51464.2021.9456347
  • Gao, K., Liu, B., Yu, X., & Yu, A. (2022). Unsupervised meta learning with multiview constraints for hyperspectral image small sample set classification. IEEE Transactions on Image Processing, 31, 3449–3462. https://doi.org/10.1109/TIP.2022.3169689
  • Guo, J. (2021). Prototype calibration with feature generation for few-shot remote sensing image scene classification. Remote Sensing, 13(14), 2728. https://doi.org/10.3390/rs13142728
  • Gyamfi, K. S., Brusey, J., Hunt, A., & Gaura, E. I. (2018). Linear dimensionality reduction for classification via a sequential Bayes error minimisation with an application to flow meter diagnostics. Expert Systems with Applications, 91, 252–262. https://doi.org/10.1016/j.eswa.2017.09.010
  • H, M. Y. (2022). BMI Dataset. https://www.kaggle.com/datasets/yasserh/bmidataset
  • Hansen, L., & Salamon, P. (1990). Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 993–1001. https://doi.org/10.1109/34.58871
  • Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., & Muller, P. A. (2019). Deep learning for time series classification: A review. Data Mining and Knowledge Discovery, 33(4), 917–963. https://doi.org/10.1007/s10618-019-00619-1
  • Jee, H., Lee, K., & Pan, S. (2004). Eye and face detection using SVM. In Proceedings of the 2004 Intelligent Sensors, Sensor Networks and Information Processing Conference, 2004 (pp. 577–580). https://doi.org/10.1109/ISSNIP.2004.1417525
  • Johnson, B. A., Tateishi, R., & Xie, Z. (2012). Using geographically weighted variables for image classification. Remote Sensing Letters, 3(6), 491–499. https://doi.org/10.1080/01431161.2011.629637
  • Katiyar, G., & Mehfuz, S. (2015). SVM based off-line handwritten digit recognition. In 2015 Annual IEEE India Conference (INDICON) (pp. 1–5). https://doi.org/10.1109/INDICON.2015.7443398
  • Koklu, M., & Ozkan, I. A. (2020). Multiclass classification of dry beans using computer vision and machine learning techniques. Computers and Electronics in Agriculture, 174, 105507. https://doi.org/10.1016/j.compag.2020.105507
  • Kothari, S., & Oh, H. (1993). Neural networks for pattern recognition. In M. C. Yovits (Ed.) (Vol. 37, pp. 119–166). Elsevier.
  • Cios, K. J., & Kurgan, L. A. (2001). SPECTF Heart Data Set. https://archive.ics.uci.edu/ml/datasets/SPECTF+Heart
  • Li, H., Zhao, H., & Li, H. (2019). Neural-response-based extreme learning machine for image classification. IEEE Transactions on Neural Networks and Learning Systems, 30(2), 539–552. https://doi.org/10.1109/TNNLS.2018.2845857
  • Liu, J., Fan, L., Jia, Q., Wen, L., & Shi, C. (2021). Early diabetes prediction based on stacking ensemble learning model. In 2021 33rd Chinese Control and Decision Conference (CCDC) (pp. 2687–2692). https://doi.org/10.1109/CCDC52312.2021.9601932
  • Lohweg, V. (2012). Banknote authentication data set. https://archive.ics.uci.edu/ml/datasets/banknote+authentication
  • Madan, H. (2022). 3-wine classification dataset. https://www.kaggle.com/datasets/tug004/3wine-classification-dataset
  • Mehra, A. (2022a). 3-wine classification dataset. https://www.kaggle.com/datasets/tug004/3wine-classification-dataset
  • Mehra, A. (2022b). roomoccupancy. https://www.kaggle.com/datasets/aahanmehra/roomoccupancy
  • Mounica, R. O., Soumya, V., Krovvidi, S., Chandrika, K. S., & Gayathri, R. (2019). A multi layer ensemble learning framework for learning disability detection in school-aged children. In 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1–6). https://doi.org/10.1109/ICCCNT45670.2019.8944774
  • Naqi, S. M., Sharif, M., & Yasmin, M. (2018). Multistage segmentation model and SVM-ensemble for precise lung nodule detection. International Journal of Computer Assisted Radiology and Surgery, 13(7), 1083–1095. https://doi.org/10.1007/s11548-018-1715-9
  • Nekouei, M., & Sartoli, S. (2019). Modeling the structured porous network using stacked ensemble learning. In 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC) (Vol. 2, pp. 80–84). https://doi.org/10.1109/COMPSAC.2019.10187
  • Oloso, M. A., Hassan, M. G., Bader-El-Den, M. B., & Buick, J. M. (2018). Ensemble SVM for characterisation of crude oil viscosity. Journal of Petroleum Exploration and Production Technology, 8, 531–543. https://doi.org/10.1007/s13202-017-0355-x
  • Patrício, M., Pereira, J., Crisóstomo, J., Matafome, P., Gomes, M., Seiça, R., & Caramelo, F. (2018, January). Using resistin, glucose, age and BMI to predict the presence of breast cancer. BMC Cancer, 18(1), 29. https://doi.org/10.1186/s12885-017-3877-1
  • Rahman, R. (2022). Heart Attack Analysis & Prediction Dataset. https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset
  • Sejnowski, T. J., & Rosenberg, C. R. (1987). Parallel networks that learn to pronounce English text. Complex Systems, 1(1), 145–168.
  • Shapiro, A. (1989). Chess (King-Rook vs. King-Pawn) Data Set. https://archive.ics.uci.edu/ml/datasets/Chess+King-Rook+vs.+King-Pawn
  • Sohail, S., Fan, Z., Gu, X., & Sabrina, F. (2022). Multi-tiered artificial neural networks model for intrusion detection in smart homes. Intelligent Systems with Applications, 16, 200152. https://doi.org/10.1016/j.iswa.2022.200152
  • Sun, Y., & Fu, L. (2022). Stacking ensemble learning for non-line-of-sight detection of global navigation satellite system. IEEE Transactions on Instrumentation and Measurement, 71, 1–10. https://doi.org/10.1109/TIM.2022.3170985
  • Ukey, K. P., & Alvi, A. S. (2012). Text classification using support vector machine. Journal of Software Engineering & Applications, 5(12), 55–58. https://doi.org/10.4236/jsea.2012.512B012
  • Vapnik, V. (2000). SVM method of estimating density, conditional probability, and conditional density. In 2000 IEEE International Symposium on Circuits and Systems (ISCAS) (Vol. 2, pp. 749–752). https://doi.org/10.1109/ISCAS.2000.856437
  • Vapnik, V. N. (1995). The nature of statistical learning theory. Springer-Verlag New York, Inc.
  • Wang, J., Zhao, C., Huo, Z., Qiao, Y., & Sima, H. (2022). High quality proposal feature generation for crowded pedestrian detection. Pattern Recognition, 128(5), 108605. https://doi.org/10.1016/j.patcog.2022.108605
  • Wang, K., Liu, X., Zhao, J., Gao, H., & Zhang, Z. (2020). Application research of ensemble learning frameworks. In 2020 Chinese Automation Congress (CAC) (pp. 5767–5772). https://doi.org/10.1109/CAC51589.2020.9326882
  • Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259. https://doi.org/10.1016/S0893-6080(05)80023-1
  • Woods, K., Kegelmeyer, W., & Bowyer, K. (1997). Combination of multiple classifiers using local accuracy estimates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4), 405–410. https://doi.org/10.1109/34.588027
  • Zhang, T., & Li, J. (2021). Credit risk control algorithm based on stacking ensemble learning. In 2021 IEEE International Conference on Power Electronics, Computer Applications (ICPECA) (pp. 668–670). https://doi.org/10.1109/ICPECA51329.2021.9362514
  • Zhou, Z. H. (2021). Ensemble learning. In Machine learning (pp. 181–210). Springer Singapore. https://doi.org/10.1007/978-981-15-1967-3_8