Research Article

A continual learning framework to train robust image recognition models by adversarial training and knowledge distillation

Article: 2379268 | Received 26 Aug 2023, Accepted 08 Jul 2024, Published online: 20 Jul 2024
