Optimizing Machine Learning-Based Network Intrusion Detection System with Oversampling, Feature Selection and Extraction

Authors

  • Rama Wijaya Shiddiq, Telkom University
  • Nyoman Karna, Telkom University
  • Indrarini Dyah Irawati, Telkom University

DOI:

https://doi.org/10.26555/jiteki.v11i2.30675

Keywords:

Machine Learning, Network Intrusion Detection System, Imbalanced Dataset Handling in NIDS, Feature Selection and Extraction, Optuna

Abstract

Network security is a global challenge that requires intelligent and efficient solutions. Machine Learning (ML)-based Network Intrusion Detection Systems (NIDS) have been shown to improve accuracy in detecting cyberattacks. However, the main challenges in implementing an ML-based NIDS are dataset imbalance and large dataset size. This research addresses these challenges by applying oversampling to balance the dataset, feature selection using random forest to identify the most relevant features, and feature extraction using Principal Component Analysis (PCA) to further reduce the selected features. Additionally, K-fold cross-validation is used to evaluate the features, minimizing bias and guarding against overfitting, while Optuna is used to automatically tune model parameters for maximum accuracy. Since IDS performance deteriorates with high-dimensional features, the combined methods are evaluated with feature selection applied to datasets with 45 features selected from UNSW-NB15, 78 features from CIC-IDS-2017, and 80 features from CIC-IDS-2018, using various ML algorithms. The results demonstrate that combining these techniques with feature selection and per-model optimization significantly improves performance over conventional methods in network traffic analysis on large and imbalanced datasets, reaching 99% accuracy.
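
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of UNSW-NB15 or CIC-IDS, and substitutes plain random oversampling for whichever oversampling technique the paper applies. The helper names (`random_oversample`, `evaluate`) and all hyperparameter values are hypothetical placeholders, and Optuna tuning is not shown. Oversampling, selection, and PCA are fitted on the training part of each fold only, so the validation fold stays untouched.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler


def random_oversample(X, y, rng):
    """Duplicate rows of smaller classes at random until every class
    has as many rows as the largest one (assumed stand-in technique)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        parts.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(parts)
    rng.shuffle(idx)
    return X[idx], y[idx]


def evaluate(X, y, n_splits=5, n_components=5, seed=0):
    """Stratified K-fold evaluation of: oversample -> random forest
    feature selection -> PCA -> classifier. Returns mean and std macro-F1."""
    rng = np.random.default_rng(seed)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in skf.split(X, y):
        # Balance the training fold only, to avoid leaking duplicates
        # of validation rows into training.
        X_tr, y_tr = random_oversample(X[train_idx], y[train_idx], rng)

        # Keep features whose forest importance is at least the median.
        selector = SelectFromModel(
            RandomForestClassifier(n_estimators=50, random_state=seed),
            threshold="median",
        )
        X_tr_sel = selector.fit_transform(X_tr, y_tr)
        X_val_sel = selector.transform(X[val_idx])

        # Standardize, then project the selected features with PCA.
        scaler = StandardScaler().fit(X_tr_sel)
        k = min(n_components, X_tr_sel.shape[1])
        pca = PCA(n_components=k, random_state=seed)
        X_tr_pca = pca.fit_transform(scaler.transform(X_tr_sel))
        X_val_pca = pca.transform(scaler.transform(X_val_sel))

        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        clf.fit(X_tr_pca, y_tr)
        pred = clf.predict(X_val_pca)
        scores.append(f1_score(y[val_idx], pred, average="macro"))
    return float(np.mean(scores)), float(np.std(scores))


if __name__ == "__main__":
    # Synthetic imbalanced data: roughly 90% "benign", 10% "attack".
    X, y = make_classification(
        n_samples=1000, n_features=20, n_informative=8,
        weights=[0.9, 0.1], random_state=0,
    )
    mean_f1, std_f1 = evaluate(X, y)
    print(f"macro-F1: {mean_f1:.3f} +/- {std_f1:.3f}")
```

Macro-F1 is reported here rather than accuracy because accuracy alone can look high on imbalanced traffic even when the minority attack class is poorly detected. A hyperparameter search such as the Optuna step mentioned in the abstract would wrap `evaluate` in an objective function and vary values like `n_components` or the number of trees.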

Published

2025-04-28

How to Cite

[1]
R. W. Shiddiq, N. Karna, and I. D. Irawati, “Optimizing Machine Learning-Based Network Intrusion Detection System with Oversampling, Feature Selection and Extraction”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 11, no. 2, pp. 225–237, Apr. 2025.

Section

Articles
