
Evaluation of Ensemble and Deep Learning Classifiers on CSE-CIC-IDS2018 Dataset for Intelligent NIDS

Kaushik Datta, Tapas Samanta, Sarbajit Pal

Abstract


A Network Intrusion Detection System (NIDS) plays an active role in preventing cyberattacks by detecting threats early, before they begin to affect the targeted information services. Over the years, many intrusion detection systems (IDSs) have been developed using signature-based or rule-based approaches to prevent unauthorised access to networks and computing devices. However, the ever-growing landscape of cyberattacks in recent years has motivated researchers to design and develop more accurate IDSs using modern Machine Learning (ML) methods that identify attacks through anomaly detection. The development of an intelligent NIDS depends heavily on a rich, up-to-date dataset that contains relevant attributes and real-world cyberattack scenarios. A variety of datasets are available for this purpose, of which KDDCUP99, NSLKDD, ISCX2012, CICIDS2017, CICIDS2018 and Kyoto are among the most popular and widely used. This study reports our observations on the performance of two well-known Ensemble Learning classifiers, namely Random Forest and XGBoost, and of a Deep Neural Network classifier on the CSE-CIC-IDS2018 dataset, which is relatively new and covers many contemporary cyberattacks. Their performance is evaluated using multiple metrics, including the Precision-Recall curve, which has been shown to be more useful for imbalanced datasets such as CSE-CIC-IDS2018.
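To illustrate how a Precision-Recall curve is read when attacks are a small minority of the traffic, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the labels and the classifier scores are all hypothetical, and the average-precision formula used is the common step-wise sum of precision weighted by the increase in recall.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score,
    from the highest threshold to the lowest. Labels: 1 = attack, 0 = benign."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive (attack) sample")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points):
    """Step-wise area under the Precision-Recall curve."""
    ap = 0.0
    prev_recall = 0.0
    for _, precision, recall in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical toy data: 2 attack flows among 10 (illustrative only).
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

for threshold, precision, recall in precision_recall_points(y_true, scores):
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
print("average precision:", average_precision(precision_recall_points(y_true, scores)))
```

In this toy case, a threshold of 0.6 flags four flows and catches both attacks, so 8 of 10 flows are classified correctly, yet only half of the alerts are real. Precision and recall expose that trade-off directly, whereas accuracy is dominated by the benign majority.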


Keywords


Network intrusion detection system, CSE-CIC-IDS2018 dataset, ensemble learning, multilayer perceptron, random forest, XGBoost, deep neural network





Copyright (c) 2023 Current Trends in Information Technology

  • eISSN: 2249-4707
  • ISSN: 2348-7895