Optimizing Machine Learning-Based Network Intrusion Detection System with Oversampling, Feature Selection and Extraction
DOI:
https://doi.org/10.26555/jiteki.v11i2.30675
Keywords:
Machine Learning, Network Intrusion Detection System, Imbalanced Dataset Handling in NIDS, Feature Selection and Extraction, Optuna
Abstract
Network security is a global challenge that requires intelligent and efficient solutions. Machine Learning (ML)-based Network Intrusion Detection Systems (NIDS) have been shown to improve accuracy in detecting cyberattacks. However, the main challenges in implementing an ML-based IDS are dataset imbalance and large dataset size. This research addresses these challenges by applying oversampling to balance the dataset, feature selection with random forest to identify the most relevant features, and feature extraction with Principal Component Analysis (PCA) to further reduce the selected features. In addition, K-fold cross-validation is used to evaluate the features, which minimizes bias and guards against overfitting, while Optuna automatically tunes model parameters for maximum accuracy. Because IDS performance deteriorates with high-dimensional features, the combined method is evaluated with feature selection applied to datasets with 45 features from UNSW-NB15, 78 features from CIC-IDS-2017, and 80 features from CIC-IDS-2018, using various ML algorithms. The results show that combining these techniques with feature selection and per-model optimization significantly improves performance on large, imbalanced datasets compared to conventional methods in network traffic analysis, reaching 99% accuracy.
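The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data from `make_classification`, the helper names `random_oversample` and `evaluate`, and every parameter value (number of trees, selected features, principal components, folds) are assumptions chosen only for illustration. Simple random duplication stands in for whichever oversampling technique the paper uses, and the Optuna tuning step is omitted; a tuner could call a function like `evaluate` as its objective.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler


def random_oversample(X, y, rng):
    """Duplicate rows of smaller classes at random until every class
    has as many rows as the largest class (assumed stand-in technique)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        parts.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(parts)
    return X[idx], y[idx]


def evaluate(X, y, n_selected=10, n_components=5, n_splits=5, seed=0):
    """Mean K-fold accuracy of: oversample -> RF feature selection -> PCA -> RF."""
    rng = np.random.default_rng(seed)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in cv.split(X, y):
        # Oversample the training fold only, so duplicated rows never
        # leak into the held-out fold.
        X_tr, y_tr = random_oversample(X[train_idx], y[train_idx], rng)
        X_te, y_te = X[test_idx], y[test_idx]

        # Feature selection: keep the features a random forest ranks highest.
        ranker = RandomForestClassifier(n_estimators=50, random_state=seed)
        ranker.fit(X_tr, y_tr)
        top = np.argsort(ranker.feature_importances_)[::-1][:n_selected]

        # Feature extraction: standardize, then project onto principal components.
        scaler = StandardScaler().fit(X_tr[:, top])
        pca = PCA(n_components=n_components, random_state=seed)
        Z_tr = pca.fit_transform(scaler.transform(X_tr[:, top]))
        Z_te = pca.transform(scaler.transform(X_te[:, top]))

        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        clf.fit(Z_tr, y_tr)
        scores.append(accuracy_score(y_te, clf.predict(Z_te)))
    return float(np.mean(scores))


# Synthetic imbalanced data (roughly 90% / 10%), not one of the paper's datasets.
X_demo, y_demo = make_classification(
    n_samples=600, n_features=20, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
print(f"mean cross-validated accuracy: {evaluate(X_demo, y_demo):.3f}")
```

Fitting the oversampler, feature ranking, scaler, and PCA inside each fold keeps the held-out data untouched until prediction, which is what lets the cross-validated score say something about overfitting.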
License
Copyright (c) 2025 Rama Wijaya Shiddiq

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with JITEKI agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.