Using Graph Neural Networks and CatBoost for Internet Security Prediction with SMOTE

Authors

  • Aswan Supriyadi Sunge, Informatics Engineering Department, Pelita Bangsa University, Bekasi, Indonesia
  • Spits Warnars Harco Leslie Hendric, Computer Science Department, Graduate Program-Doctor of Computer Science, Bina Nusantara University, Jakarta, Indonesia
  • Dendy K. Pramudito, Informatics Engineering Department, Pelita Bangsa University, Bekasi, Indonesia

DOI:

https://doi.org/10.26555/jiteki.v10i4.30157

Keywords:

Predictions, Website, Security, CatBoost, GNNs

Abstract

Internet security is one of the most important issues in cyberspace. Cybercrime persists, and its most serious threat is the theft of personal data and its misuse for the benefit of others. Although internet security cannot eliminate all risks, predictive models can significantly reduce cybercrime by identifying vulnerabilities so that they can be prevented. Many internet users do not know which protective measures to take or whether a website is safe to visit or explore. On the system development side, existing studies on internet security prediction often rely on generic models that lack precision in identifying influential features or ensuring class balance. Deep Learning (DL) helps here by learning relevant patterns from recorded data and applying the resulting model effectively. The purpose of this study is to identify the most influential features in internet security and to evaluate the effectiveness of advanced machine learning models, namely Graph Neural Networks (GNNs) and Categorical Boosting (CatBoost), for predicting internet safety. Previous studies have so far tested the entire dataset and used general-purpose models. The results are expected to support the design and development of systems and programs that are useful for internet security. The study used a dataset of 11,055 records with 30 features and binary classification labels ('Safe' and 'Not Safe'). To address class imbalance, SMOTE was applied before splitting the data into training and testing sets. In testing, the GNNs model achieved 93.58% accuracy, 93.63% precision, 93.58% recall, and a 93.55% F1-score, demonstrating its effectiveness for internet security prediction. The CatBoost model was used to identify key features, revealing that 'URL of Anchor,' 'SSLFinal State,' and 'Web Traffic' have the most significant impact. The experiments show that CatBoost effectively identified the features with the highest impact on prediction accuracy, and that the GNNs model is accurate and precise enough to support applications or systems that predict internet security.
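
SMOTE balances classes by synthesizing new minority-class records along the line segment between a minority record and one of its nearest minority neighbours. The interpolation step can be sketched in plain Python as follows. This is a minimal illustration on toy data: the function name `smote`, its parameters, and the records are assumptions made for the example, not the study's implementation or dataset.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; needs at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed): 6 'Safe' records and 3 'Not Safe' records, 2 features each.
safe = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
not_safe = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Oversample the minority class until both classes have the same size.
new_points = smote(not_safe, n_new=len(safe) - len(not_safe), k=2)
balanced_not_safe = not_safe + new_points
```

Because every synthetic point is a convex combination of two real minority records, it always falls inside the region already spanned by the minority class.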

Author Biography

Aswan Supriyadi Sunge, Informatics Engineering Department, Pelita Bangsa University, Bekasi, Indonesia

Informatics Engineering Department

Published

2024-12-20

How to Cite

[1]
A. S. Sunge, S. W. H. L. Hendric, and D. K. Pramudito, “Using Graph Neural Networks and CatBoost for Internet Security Prediction with SMOTE”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 10, no. 4, pp. 747–762, Dec. 2024.

Section

Articles