
Concept Drift Monitoring and Diagnostics of Supervised Learning Models via Score Vectors

Pages 137-149 | Received 18 Jul 2021, Accepted 28 Jul 2022, Published online: 31 Oct 2022

References

  • Baena-García, M., del Campo-Ávila, J., Fidalgo, R., Bifet, A., Gavalda, R., and Morales-Bueno, R. (2006), “Early Drift Detection Method,” in Fourth International Workshop on Knowledge Discovery from Data Streams (Vol. 6), pp. 77–86.
  • Barros, R. S. M., and Santos, S. G. T. C. (2018), “A Large-Scale Comparison of Concept Drift Detectors,” Information Sciences, 451, 348–370. DOI: 10.1016/j.ins.2018.04.014.
  • Bickel, P. J., and Doksum, K. A. (2015), Mathematical Statistics: Basic Ideas and Selected Topics, Chapman & Hall/CRC Texts in Statistical Science Book 117 (Vol. 1), Boca Raton, FL: CRC Press.
  • Bifet, A., and Gavalda, R. (2007), “Learning from Time-Changing Data with Adaptive Windowing,” in Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 443–448, SIAM. DOI: 10.1137/1.9781611972771.42.
  • Bottou, L., Curtis, F. E., and Nocedal, J. (2018), “Optimization Methods for Large-Scale Machine Learning,” SIAM Review, 60, 223–311. DOI: 10.1137/16M1080173.
  • Box, G. E. P. (1976), “Science and Statistics,” Journal of the American Statistical Association, 71, 791–799. DOI: 10.1080/01621459.1976.10480949.
  • Calandra, R., Raiko, T., Deisenroth, M. P., and Pouzols, F. M. (2012), “Learning Deep Belief Networks from Non-stationary Streams,” in International Conference on Artificial Neural Networks, pp. 379–386, Springer.
  • Carmona-Cejudo, J. M., Baena-García, M., del Campo-Ávila, J., Morales-Bueno, R., and Bifet, A. (2010), “GNUsmail: Open Framework for On-line Email Classification,” in Frontiers in Artificial Intelligence and Applications 215, pp. 1141–1142, IOS Press.
  • Crook, J. N., Thomas, L. C., and Hamilton, R. (1992), “The Degradation of the Scorecard Over the Business Cycle,” IMA Journal of Management Mathematics, 4, 111–123. DOI: 10.1093/imaman/4.1.111.
  • Donoho, S. (2004), “Early Detection of Insider Trading in Option Markets,” in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 420–429.
  • Fong, Y., Di, C., and Permar, S. (2015), “Change Point Testing in Logistic Regression Models with Interaction Term,” Statistics in Medicine, 34, 1483–1494. DOI: 10.1002/sim.6419.
  • Frías-Blanco, I., del Campo-Ávila, J., Ramos-Jiménez, G., Morales-Bueno, R., Ortiz-Díaz, A., and Caballero-Mota, Y. (2015), “Online and Non-parametric Drift Detection Methods based on Hoeffding’s Bounds,” IEEE Transactions on Knowledge and Data Engineering, 27, 810–823. DOI: 10.1109/TKDE.2014.2345382.
  • Gama, J., Medas, P., Castillo, G., and Rodrigues, P. (2004), “Learning with Drift Detection,” in Brazilian Symposium on Artificial Intelligence, pp. 286–295, Springer.
  • Gonçalves, Jr., P. M., de Carvalho Santos, S. G. T., Barros, R. S. M., and Vieira, D. C. L. (2014), “A Comparative Study on Concept Drift Detectors,” Expert Systems with Applications, 41, 8144–8156. DOI: 10.1016/j.eswa.2014.07.019.
  • Harel, M., Mannor, S., El-Yaniv, R., and Crammer, K. (2014), “Concept Drift Detection through Resampling,” in International Conference on Machine Learning, pp. 1009–1017.
  • Hotelling, H. (1947), “Multivariate Quality Control Illustrated by Air Testing of Sample Bombsights,” in Techniques of Statistical Analysis, eds. C. Eisenhart, M. W. Hastay, and W. A. Wallis, pp. 111–184, New York: McGraw-Hill.
  • Hulten, G., Spencer, L., and Domingos, P. (2001), “Mining Time-Changing Data Streams,” in Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97–106. DOI: 10.1145/502512.502529.
  • Im, J. K., Apley, D. W., Qi, C., and Shan, X. (2012), “A Time-Dependent Proportional Hazards Survival Model for Credit Risk Analysis,” Journal of the Operational Research Society, 63, 306–321. DOI: 10.1057/jors.2011.34.
  • Kingma, D. P., and Ba, J. (2014), “Adam: A Method for Stochastic Optimization,” arXiv preprint arXiv:1412.6980.
  • Kuan, C.-M., and Hornik, K. (1995), “The Generalized Fluctuation Test: A Unifying View,” Econometric Reviews, 14, 135–161. DOI: 10.1080/07474939508800311.
  • Kukar, M. (2003), “Drifting Concepts as Hidden Factors in Clinical Studies,” in Conference on Artificial Intelligence in Medicine in Europe, pp. 355–364, Springer.
  • Lowry, C. A., Woodall, W. H., Champ, C. W., and Rigdon, S. E. (1992), “A Multivariate Exponentially Weighted Moving Average Control Chart,” Technometrics, 34, 46–53. DOI: 10.2307/1269551.
  • Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., and Zhang, G. (2018), “Learning under Concept Drift: A Review,” IEEE Transactions on Knowledge and Data Engineering, 31, 2346–2363. DOI: 10.1109/TKDE.2018.2876857.
  • Mejri, D., Limam, M., and Weihs, C. (2018), “A New Dynamic Weighted Majority Control Chart for Data Streams,” Soft Computing, 22, 511–522. DOI: 10.1007/s00500-016-2351-3.
  • Mejri, D., Limam, M., and Weihs, C. (2021), “A New Time Adjusting Control Limits Chart for Concept Drift Detection,” IFAC Journal of Systems and Control, 17, 100170.
  • Montgomery, D. C. (2007), Introduction to Statistical Quality Control, New York: Wiley.
  • Moreno-Torres, J. G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N. V., and Herrera, F. (2012), “A Unifying View on Dataset Shift in Classification,” Pattern Recognition, 45, 521–530. DOI: 10.1016/j.patcog.2011.06.019.
  • Nelder, J. A., and Wedderburn, R. W. M. (1972), “Generalized Linear Models,” Journal of the Royal Statistical Society, Series A, 135, 370–384. DOI: 10.2307/2344614.
  • Pechenizkiy, M., Tsymbal, A., Puuronen, S., Shifrin, M., and Alexandrova, I. (2005), “Knowledge Discovery from Microbiology Data: Many-Sided Analysis of Antibiotic Resistance in Nosocomial Infections,” in Biennial Conference on Professional Knowledge Management/Wissensmanagement, pp. 360–372, Springer.
  • Raza, H., Prasad, G., and Li, Y. (2015), “EWMA Model based Shift-Detection Methods for Detecting Covariate Shifts in Non-stationary Environments,” Pattern Recognition, 48, 659–669. DOI: 10.1016/j.patcog.2014.07.028.
  • Roberts, S. W. (1959), “Control Chart Tests based on Geometric Moving Averages,” Technometrics, 1, 239–250. DOI: 10.1080/00401706.1959.10489860.
  • Ross, G. J., Adams, N. M., Tasoulis, D. K., and Hand, D. J. (2012), “Exponentially Weighted Moving Average Charts for Detecting Concept Drift,” Pattern Recognition Letters, 33, 191–198. DOI: 10.1016/j.patrec.2011.08.019.
  • Shannon, C. E. (1948), “A Mathematical Theory of Communication,” Bell System Technical Journal, 27, 379–423. DOI: 10.1002/j.1538-7305.1948.tb01338.x.
  • Towns, J., Cockerill, T., Dahan, M., Foster, I., Gaither, K., Grimshaw, A., Hazlewood, V., Lathrop, S., Lifka, D., Peterson, G. D., and Roskies, R. (2014), “XSEDE: Accelerating Scientific Discovery,” Computing in Science & Engineering, 16, 62–74.
  • Tsymbal, A. (2004), “The Problem of Concept Drift: Definitions and Related Work,” Computer Science Department, Trinity College Dublin, 106, 2.
  • Tsymbal, A., Pechenizkiy, M., Cunningham, P., and Puuronen, S. (2008), “Dynamic Integration of Classifiers for Handling Concept Drift,” Information Fusion, 9, 56–68. DOI: 10.1016/j.inffus.2006.11.002.
  • Wang, H., and Abraham, Z. (2015), “Concept Drift Detection for Streaming Data,” in 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–9, IEEE.
  • Wang, H., Fan, W., Yu, P. S., and Han, J. (2003), “Mining Concept-Drifting Data Streams using Ensemble Classifiers,” in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226–235, ACM. DOI: 10.1145/956750.956778.
  • Widmer, G., and Kubat, M. (1996), “Learning in the Presence of Concept Drift and Hidden Contexts,” Machine Learning, 23, 69–101. DOI: 10.1007/BF00116900.
  • Xia, Z., Guo, P., and Zhao, W. (2009), “Monitoring Structural Changes in Generalized Linear Models,” Communications in Statistics—Theory and Methods, 38, 1927–1947. DOI: 10.1080/03610920802549910.
  • Zeileis, A. (2005), “A Unified Approach to Structural Change Tests based on ML Scores, F Statistics, and OLS Residuals,” Econometric Reviews, 24, 445–466. DOI: 10.1080/07474930500406053.
  • Zeileis, A., and Hornik, K. (2007), “Generalized M-Fluctuation Tests for Parameter Instability,” Statistica Neerlandica, 61, 488–508. DOI: 10.1111/j.1467-9574.2007.00371.x.
  • Žliobaitė, I., Bakker, J., and Pechenizkiy, M. (2012), “Beating the Baseline Prediction in Food Sales: How Intelligent an Intelligent Predictor is?,” Expert Systems with Applications, 39, 806–815. DOI: 10.1016/j.eswa.2011.07.078.
  • Žliobaitė, I., Pechenizkiy, M., and Gama, J. (2016), “An Overview of Concept Drift Applications,” in Big Data Analysis: New Algorithms for a New Society, eds. N. Japkowicz and J. Stefanowski, pp. 91–114, Cham: Springer.
