A survey of intrusion detection from the perspective of intrusion datasets and machine learning techniques

Pages 659-669 | Received 24 Jul 2020, Accepted 30 Jan 2021, Published online: 14 Feb 2021

References

  • Di Pietro R, Mancini LV. Intrusion detection systems (Vol. 38). New York: Springer Science & Business Media; 2008.
  • Gu G, Fogla P, Dagon D, et al. Measuring intrusion detection capability: an information-theoretic approach. In: Ferngching Lin, editor. Proceedings of the 2006 ACM symposium on information, computer and communications security; New York: Association for Computing Machinery; 2006 Mar. p. 90–101.
  • Axelsson S. The base-rate fallacy and the difficulty of intrusion detection. ACM Trans Inf Syst Secur (TISSEC). 2000;3(3):186–205.
  • Gartner Survey. Newsroom. 2019. [cited 2019 Jan 23]. Available from: https://www.gartner.com/en/newsroom/press-releases/2019-01-23-gartner-survey-finds-government-cios-to-focus-technol.
  • Hoque N, Bhuyan MH, Baishya RC, et al. Network attacks: taxonomy, tools and systems. J Netw Comput Appl. 2014;40:307–324.
  • Buczak AL, Guven E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun Surv Tutorials. 2015;18(2):1153–1176.
  • Ahmed M, Mahmood AN, Hu J. A survey of network anomaly detection techniques. J Netw Comput Appl. 2016;60:19–31.
  • Ring M, Wunderlich S, Scheuring D, et al. A survey of network-based intrusion detection data sets. Comput Secur. 2019;86:147–167.
  • Divekar A, Parekh M, Savla V, et al. Benchmarking datasets for anomaly-based network intrusion detection: KDD CUP 99 alternatives. 2018 IEEE 3rd International Conference on Computing, Communication and Security (ICCCS); 2018 October; IEEE. p. 1–8.
  • Lippmann R, Haines JW, Fried DJ, et al. The 1999 DARPA off-line intrusion detection evaluation. Comput Netw. 2000;34(4):579–595.
  • Elkan C. Results of the KDD’99 classifier learning contest. Sponsored by the International Conference on Knowledge Discovery in Databases; 1999 September.
  • Engen V, Vincent J, Phalp K. Exploring discrepancies in findings obtained with the KDD Cup'99 data set. Intell Data Anal. 2011;15(2):251–276.
  • Tavallaee M, Bagheri E, Lu W, et al. A detailed analysis of the KDD CUP 99 data set. 2009 IEEE symposium on computational intelligence for security and defense applications; 2009 July; IEEE. p. 1–6.
  • McHugh J. Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans Inf Syst Secur (TISSEC). 2000;3(4):262–294.
  • Devan P, Khare N. An efficient XGBoost–DNN-based classification model for network intrusion detection system. Neural Comput Appl. 2020;32:1–16.
  • Khare N, Devan P, Chowdhary CL, et al. SMO-DNN: spider monkey optimization and deep neural network hybrid classifier model for intrusion detection. Electronics (Basel). 2020;9(4). https://doi.org/10.3390/electronics9040692.
  • Kolias C, Kambourakis G, Stavrou A, et al. Intrusion detection in 802.11 networks: empirical evaluation of threats and a public dataset. IEEE Commun Surv Tutorials. 2015;18(1):184–208.
  • Laptev N, Amizadeh S. Yahoo anomaly detection dataset S5. 2015. Available from: http://webscope.sandbox.yahoo.com/catalog.php.
  • Lavin A, Ahmad S. Evaluating real-time anomaly detection algorithms–the numenta anomaly benchmark. 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA); 2015 December; IEEE. p. 38–44.
  • Singh N, Olinsky C. Demystifying Numenta anomaly benchmark. 2017 International Joint Conference on Neural Networks (IJCNN); 2017 May; IEEE. p. 1570–1577.
  • Song J, Takakura H, Okabe Y. Description of Kyoto University benchmark data. 2006. [cited 2016 Mar 15]. Available from: http://www.takakura.com/Kyoto_data/BenchmarkData-Description-v5.pdf.
  • Song J, Takakura H, Okabe Y, et al. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security; 2011 April. p. 29–36.
  • Moustafa N, Slay J. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: Military communications and information systems conference (MilCIS), 2015. New York: IEEE; 2015. p. 1–6.
  • Moustafa N, Slay J. The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf Secur J: A Global Perspect. 2016;25(1-3):18–31.
  • Koroniotis N, Moustafa N, Sitnikova E, et al. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-IoT dataset. Future Gener Comput Syst. 2019;100:779–796.
  • Taheri R, Ghahramani M, Javidan R, et al. Similarity-based Android malware detection using hamming distance of static binary features. Future Gener Comput Syst. 2020;105:230–247.
  • Zhou Y, Jiang X. Dissecting android malware: characterization and evolution. 2012 IEEE symposium on security and privacy; 2012 May; IEEE. p. 95–109.
  • Arp D, Spreitzenbarth M, Hubner M, et al. Drebin: effective and explainable detection of android malware in your pocket. In: NDSS. Vol. 14. San Diego: Internet Society; 2014 Feb. p. 23–26.
  • Contagio Dataset. 2020. [cited 2020 Oct 21]. Available from: http://contagiominidump.blogspot.com/. https://www.sec.cs.tu-bs.de/~danarp/drebin/.
  • Ghahramani Z. Unsupervised learning. In: Bousquet O, von Luxburg U, Rätsch G, editors. Summer school on machine learning. Berlin, Heidelberg: Springer; 2003 Feb. p. 72–112.
  • Safavian SR, Landgrebe D. A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern. 1991;21(3):660–674.
  • Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
  • Dietterich T. Overfitting and undercomputing in machine learning. ACM Comput Surv (CSUR). 1995;27(3):326–327.
  • Deng L, Yu D. Deep learning: methods and applications. Found Trends® Signal Process. 2014;7(3–4):197–387.
  • Yassin W, Udzir NI, Muda Z, et al. A cloud-based intrusion detection service framework. 2012 International Conference on Cyber Security, Cyber Warfare and Digital Forensic (CyberSec); 2012 June; IEEE. p. 213–218.
  • Vapnik VN. An overview of statistical learning theory. IEEE Trans Neural Netw. 1999;10(5):988–999.
  • Akbani R, Kwek S, Japkowicz N. Applying support vector machines to imbalanced datasets. In: Boulicaut JF, Esposito F, Giannotti F, et al., editors. European conference on machine learning. Berlin, Heidelberg: Springer; 2004 Sept. p. 39–50.
  • Hawkins J, Ahmad S, Dubinsky D. Hierarchical temporal memory including HTM cortical learning algorithms. Technical report. Palo Alto: Numenta, Inc; 2010. Available from: http://www.numenta.com/htmoverview/education/HTM_CorticalLearningAlgorithms.pdf.
  • Twitter ADVec. [cited 2017 Jan 20]. Available from: https://github.com/twitter/AnomalyDetection.
  • Stanway A. etsy/skyline [Online code repository]. 2013. Available from: https://github.com/etsy/skyline.
  • Thill M, Konen W, Bäck T. Online anomaly detection on the webscope S5 dataset: A comparative study. In: Igor Skrjanc, Saso Blazic, editors. Evolving and adaptive intelligent systems (EAIS), 2017. Ljubljana: IEEE; 2017 May. p. 1–8.
  • Lopez-Martin M, Carro B, Sanchez-Esguevillas A. Application of deep reinforcement learning to intrusion detection for supervised problems. Expert Syst Appl. 2020;141:112963.
  • Mishra S, Sagban R, Yakoob A, et al. Swarm intelligence in anomaly detection systems: an overview. Int J Comput Appl. 2018:1–10.
  • Reddy GT, Khare N. FFBAT-optimized rule based fuzzy logic classifier for diabetes. Int J Eng Res Afr. 2016;24:137–152. Trans Tech Publications Ltd.
  • Reddy GT, Khare N. Hybrid firefly-bat optimized fuzzy artificial neural network based classifier for diabetes diagnosis. Int J Intell Eng Syst. 2017a;10(4):18–27.
  • Reddy GT, Khare N. Cuckoo search optimized reduction and fuzzy logic classifier for heart disease and diabetes prediction. Int J Fuzzy Syst Appl (IJFSA). 2017b;6(2):25–42.
  • Reddy GT, Khare N. An efficient system for heart disease prediction using hybrid OFBAT with rule-based fuzzy logic model. J Circuits Syst Comput. 2017c;26(04):1750061.
  • Tama BA, Comuzzi M, Rhee KH. TSE-IDS: A two-stage classifier ensemble for intelligent anomaly-based intrusion detection system. IEEE Access. 2019;7:94497–94507.
  • Hamid Y, Sugumaran M. A t-SNE based non linear dimension reduction for network intrusion detection. Int J Inf Technol. 2020;12(1):125–134.
  • Khan FA, Gumaei A, Derhab A, et al. A novel two-stage deep learning model for efficient network intrusion detection. IEEE Access. 2019;7:30373–30385.
  • Yang Y, Zheng K, Wu C, et al. Improving the classification effectiveness of intrusion detection by using improved conditional variational autoencoder and deep neural network. Sensors. 2019;19(11):2528.
  • Iwendi C, Khan S, Anajemba JH, et al. The use of ensemble models for multiple class and binary class classification for improving intrusion detection systems. Sensors. 2020;20(9):2559.
  • Vasilomanolakis E, Karuppayah S, Mühlhäuser M, et al. Taxonomy and survey of collaborative intrusion detection. ACM Comput Surv (CSUR). 2015;47(4):1–33.
  • Djenouri D, Khelladi L, Badache AN. A survey of security issues in mobile ad hoc and sensor networks. In: Dusit Niyato, editor. IEEE Communications Surveys & Tutorials. Vol. 7, No. 4. Singapore: IEEE Communications Society; 2005. p. 2–28.
  • Pathan ASK, editor. Security of self-organizing networks: MANET, WSN, WMN, VANET. CRC Press; 2016.
  • Zhang T, Zhu Q. Distributed privacy-preserving collaborative intrusion detection systems for VANETs. IEEE Trans Signal Inf Process Over Netws. 2018;4(1):148–161.
  • Gao Y, Wu H, Song B, et al. A distributed network intrusion detection system for distributed denial of service attacks in vehicular ad hoc network. IEEE Access. 2019;7:154560–154571.
  • Hassine K, Erbad A, Hamila R. Important complexity reduction of random forest in multi-classification problem. 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC); 2019 June; IEEE. p. 226–231.
  • Zhang W, Han D, Li KC, et al. Wireless sensor network intrusion detection system based on MK-ELM. Soft Comput. 2020;24:1–14.
  • Moustafa N, Turnbull B, Choo KKR. An ensemble intrusion detection technique based on proposed statistical flow features for protecting network traffic of internet of things. IEEE Internet Things J. 2018;6(3):4815–4830.
  • Alkadi O, Moustafa N, Turnbull B, et al. A deep blockchain framework-enabled collaborative intrusion detection for protecting IoT and cloud networks. IEEE Internet Things J. 2020.
  • Koroniotis N, Moustafa N, Sitnikova E. A new network forensic framework based on deep learning for Internet of Things networks: a particle deep framework. Future Gener Comput Syst. 2020;110:91–106.
  • Choudhary S, Kesswani N. Analysis of KDD-Cup’99, NSL-KDD and UNSW-NB15 datasets using deep learning in IoT. Procedia Comput Sci. 2020;167:1561–1573.
  • Hajisalem V, Babaie S. A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection. Comput Netw. 2018;136:37–50.
  • Lee W, Stolfo SJ, Mok KW. A data mining framework for building intrusion detection models. Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No. 99CB36344); 1999 May; IEEE. p. 120–132.
  • Tombini E, Debar H, Mé L, et al. A serial combination of anomaly and misuse IDSes applied to HTTP traffic. 20th Annual Computer Security Applications Conference; 2004 December; IEEE. p. 428–437.
  • Salman T, Bhamare D, Erbad A, et al. Machine learning for anomaly detection and categorization in multi-cloud environments. 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud); 2017 June; IEEE. p. 97–103.
  • Patel H, Singh Rajput D, Thippa Reddy G, et al. A review on classification of imbalanced data for wireless sensor networks. Int J Distrib Sens Netw. 2020;16(4):155014772091640.
  • Tan Z, Nagar UT, He X, et al. Enhancing big data security with collaborative intrusion detection. IEEE Cloud Computing. 2014;1(3):27–33.
  • Wang Z. Deep learning-based intrusion detection with adversaries. IEEE Access. 2018;6:38367–38384.
  • Taheri R, Javidan R, Pooranian Z. Adversarial android malware detection for mobile multimedia applications in IoT environments. Multimed Tools Appl. 2020:1–17.
  • Iwendi C, Jalil Z, Javed AR, et al. Keysplitwatermark: zero watermarking algorithm for software protection against cyber-attacks. IEEE Access. 2020;8:72650–72660.
  • Bär A, Finamore A, Casas P, et al. Large-scale network traffic monitoring with DBStream, a system for rolling big data analysis. 2014 IEEE International Conference on Big Data (Big Data); 2014 October; IEEE. p. 165–170.
  • Habeeb RAA, Nasaruddin F, Gani A, et al. Real-time big data processing for anomaly detection: a survey. Int J Inf Manage. 2019;45:289–307.
