Research Article

BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model

Article: 2145642 | Received 16 Aug 2022, Accepted 04 Nov 2022, Published online: 17 Nov 2022

References

  • Aussel, N., Y. Petetin, and S. Chabridon. 2018. Improving Performances of Log Mining for Anomaly Prediction through NLP-based Log Parsing. In Proceedings of 26th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 237–43. Milwaukee.
  • Bretan, P. 2017. Trap analysis: An automated approach for deriving column height predictions in fault-bounded traps. Petroleum Geoscience 23 (1):56–69. doi:10.1144/petgeo2016-022.
  • Chen, L. J., J. Ren, P. F. Chen, X. Mao, and Q. Zhao. 2022. Limited text speech synthesis with electroglottograph based on Bi-LSTM and modified Tacotron-2. Applied Intelligence. doi:10.1007/s10489-021-03075-x.
  • Cherkasova, L., K. Ozonat, N. F. Mi, J. Symons, and E. Smirni. 2009. Automated anomaly detection and performance modeling of enterprise applications. ACM Transactions on Computer Systems 27 (3):1–32. doi:10.1145/1629087.1629089.
  • Dai, H. T., H. Li, C. S. Chen, W. Y. Shang, and T. H. Chen. 2022. Logram: Efficient log parsing using n-Gram dictionaries. IEEE Transactions on Software Engineering 48 (3):879–92. doi:10.1109/TSE.2020.3007554.
  • Devlin, J., M. W. Chang, K. Lee, and K. Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, Oct 11.
  • Do, P., and T. H. V. Phan. 2021. Developing a BERT based triple classification model using knowledge graph embedding for question answering system. Applied Intelligence 52 (1):636–51. doi:10.1007/s10489-021-02460-w.
  • Du, M., and F. F. Li. 2016. Spell: Streaming parsing of system event logs. In Proceedings of the 16th IEEE International Conference on Data Mining, 859–64. Barcelona.
  • Du, M., F. F. Li, G. N. Zheng, and V. Srikumar. 2017. Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 24th ACM-SIGSAC Conference on Computer and Communications Security, 1285–98. Dallas.
  • Greff, K., R. K. Srivastava, J. Koutnik, B. R. Steunebrink, and J. Schmidhuber. 2017. LSTM: A search space Odyssey. IEEE Transactions on Neural Networks and Learning Systems 28 (10):2222–32. doi:10.1109/TNNLS.2016.2582924.
  • Guo, H. X., S. H. Yuan, and X. T. Wu. 2021. LogBERT: Log Anomaly Detection via BERT. In Proceedings of the IEEE International Joint Conference on Neural Networks, Shenzhen. Electr Network.
  • He, P. J., J. M. Zhu, S. L. He, J. Li, and M. R. Lyu. 2018. Towards automated log parsing for large-scale log data analysis. IEEE Transactions on Dependable and Secure Computing 15 (6):931–44. doi:10.1109/TDSC.2017.2762673.
  • He, S. L., J. M. Zhu, P. J. He, and M. R. Lyu. 2016. Experience report: System log analysis for anomaly detection. In Proceedings of the 27th International Symposium on Software Reliability Engineering, 207–18. Ottawa.
  • He, P. J., J. M. Zhu, Z. B. Zheng, and M. R. Lyu. 2017. Drain: An online log parsing approach with fixed depth tree. In Proceedings of the 24th IEEE International Conference on Web Services, 33–40. Honolulu.
  • Hooshmand, M. K., and D. Hosahalli. 2022. Network anomaly detection using deep learning techniques. CAAI Transactions on Intelligence Technology 7 (2):228–43. doi:10.1049/cit2.12078.
  • Huang, S. H., Y. Liu, C. Fung, R. He, Y. Zhao, H. Yang, and Z. Luan. 2020. HitAnomaly: Hierarchical transformers for anomaly detection in system log. IEEE Transactions on Network and Service Management 17 (4):2064–76. doi:10.1109/TNSM.2020.3034647.
  • Hu, J., Y. J. Zhang, M. H. Zhao, and P. Li. 2022. Spatial-spectral extraction for hyperspectral anomaly detection. IEEE Geoscience and Remote Sensing Letters 19:19. doi:10.1109/LGRS.2021.3130908.
  • Ito, K., H. Hasegawa, Y. Yamaguchi, and H. Shimada. 2018. Detecting privacy information abuse by android apps from API call logs. Lecture Notes in Artificial Intelligence 11049:143–57. doi:10.1007/978-3-319-97916-8_10.
  • Jukic, O., I. Hedi, and A. Sarabok. 2019. Fault management API for SNMP agents. In Proceedings of the 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics, 431–34. Opatija.
  • Lee, Y., J. Kim, and P. Kang. 2021. LAnoBERT: System log anomaly detection based on BERT masked language model. arXiv preprint arXiv:2111.09564, November 18.
  • Le, V. H., and H. Y. Zhang. 2021. Log-based anomaly detection without log parsing. In Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, Australia, 492–504. Electr Network.
  • Liu, C. B., L. L. Pan, Z. J. Gu, J. Wang, Y. Ren, and Z. Wang. 2020. Valid probabilistic anomaly detection models for system logs. Wireless Communications & Mobile Computing. doi:10.1155/2020/8827185.
  • Lou, J. G., Q. Fu, S. Q. Yang, Y. Xu, and J. Li. 2010. Mining invariants from console logs for system problem detection. In Proceedings of the 2010 USENIX Annual Technical Conference, Boston, 231–44.
  • Lv, D., N. Luktarhan, and Y. Y. Chen. 2021. ConAnomaly: Content-based anomaly detection for system logs. Sensors 21 (18):6125. doi:10.3390/s21186125.
  • Maeyens, J., A. Vorstermans, and M. Verbeke. 2020. Process mining on machine event logs for profiling abnormal behaviour and root cause analysis. Annals of Telecommunications 75 (9–10):563–72. doi:10.1007/s12243-020-00809-9.
  • Makanju, A., A. N. Zincir-Heywood, and E. E. Milios. 2009. Clustering Event Logs Using Iterative Partitioning. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1255–63. Paris.
  • Mi, H. B., H. M. Wang, Y. F. Zhou, M. R. Lyu, and H. Cai. 2012. Localizing root causes of performance anomalies in cloud computing systems by analyzing request trace logs. Science China-Information Sciences 55 (12):2757–73. doi:10.1007/s11432-012-4747-8.
  • Mi, H. B., H. M. Wang, Y. F. Zhou, M. R. T. Lyu, and H. Cai. 2013. Toward fine-grained, unsupervised, scalable performance diagnosis for production cloud computing systems. IEEE Transactions on Parallel and Distributed Systems 24 (6):1245–55. doi:10.1109/TPDS.2013.21.
  • Oliner, A., and J. Stearley. 2007. What supercomputers say: A study of five system logs. In Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 575–84. Edinburgh.
  • Peng, Y. Q., T. F. Xiao, and H. T. Yuan. 2022. Cooperative gating network based on a single BERT encoder for aspect term sentiment analysis. Applied Intelligence 52 (5):5867–79. doi:10.1007/s10489-021-02724-5.
  • Setia, S., V. Jyoti, and N. Duhan. 2020. HPM: A hybrid model for user’s behavior prediction based on N-Gram parsing and access logs. Scientific Programming. doi:10.1155/2020/8897244.
  • Studiawan, H., F. Sohel, and C. Payne. 2021. Anomaly detection in operating system logs with deep learning-based sentiment analysis. IEEE Transactions on Dependable and Secure Computing 18 (5):2136–48. doi:10.1109/TDSC.2020.3037903.
  • Tang, L., T. Li, and C. S. Perng. 2011. LogSig: Generating system events from raw textual logs. In Proceedings of the 2011 ACM International Conference on Information and Knowledge Management, Glasgow, 785–94.
  • Tufek, A., and M. S. Aktas. 2021. On the provenance extraction techniques from large scale log files. Concurrency and Computation-Practice & Experience. doi:10.1002/cpe.6559.
  • Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, et al. 2017. Attention Is All You Need. Proceedings of the 31st Annual Conference on Neural Information Processing Systems, Long Beach.
  • Wang, J., C. Q. Zhao, S. M. He, Y. Gu, O. Alfarraj, and A. Abugabah. 2021. LogUAD: Log unsupervised anomaly detection based on Word2Vec. Computer Systems Science and Engineering 41 (3):1207–22. doi:10.32604/csse.2022.022365.
  • Wittkopp, T., A. Acker, S. Nedelkoski, J. Bogatinovski, D. Scheinert, et al. 2021. A2Log: Attentive Augmented Log Anomaly Detection. arXiv preprint arXiv:2109.09537, Sep 20.
  • Xu, L., L. Huang, A. Fox, D. Patterson, and M. I. Jordan. 2009. Detecting large-scale system problems by mining console logs. In Proceedings of the Twenty-second ACM SIGOPS Symposium on Operating Systems Principles, 117–32. Big Sky.
  • Yen, T. F., A. Oprea, K. Onarlioglu, T. Leetham, W. Robertson, et al. 2013. Beehive: Large-scale log analysis for detecting suspicious activity in enterprise networks. Proceedings of the 29th Annual Computer Security Applications Conference, 199–208, New Orleans.
  • Zhang, Y. Y., and A. Sivasubramaniam. 2008. Failure prediction in IBM BlueGene/L event logs. In Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, 2525. Miami.
  • Zhang, L., X. S. Xie, K. P. Xie, Z. Wang, Y. Yu, et al. 2019b. An Efficient Log Parsing Algorithm Based on Heuristic Rules. Proceedings of the 13th International Symposium on Advanced Parallel Processing Technologies, 123–134, Tianjin.
  • Zhang, X., Y. Xu, Q. W. Lin, B. Qiao, H. Y. Zhang, et al. 2019a. Robust Log-Based Anomaly Detection on Unstable Log Data. Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference (ESEC) / Symposium on the Foundations of Software Engineering, 807–817, Tallinn.
  • Zhao, Z. F., W. N. Niu, X. S. Zhang, R. Zhang, Z. Yu, and C. Huang. 2021. Trine: Syslog anomaly detection with three transformer encoders in one generative adversarial network. Applied Intelligence 52 (8):8810–19. doi:10.1007/s10489-021-02863-9.
  • Zhong, Y., Y. B. Guo, and C. H. Liu. 2018. FLP: A feature-based method for log parsing. Electronics Letters 54 (23):1334–35. doi:10.1049/el.2018.6079.
  • Zhu, J. M., S. L. He, J. Y. Liu, P. J. He, Q. Xie, et al. 2019. Tools and Benchmarks for Automated Log Parsing. Proceedings of the 41st International Conference on Software Engineering - Software Engineering in Practice, 121–130, Montreal.
  • Zhu, Y., W. B. Meng, Y. Liu, S. Zhang, T. Han, et al. 2021. UniLog: Deploy One Model and Specialize it for All Log Analysis Tasks. arXiv preprint arXiv:2112.03159, Dec 6.