Machine learning in concept drift detection using statistical measures

Pages 281-291 | Received 31 Jul 2023, Accepted 27 Nov 2023, Published online: 15 Dec 2023
