Enhancing Refactoring Prediction at the Method-Level Using Stacking and Boosting Models
DOI: https://doi.org/10.26555/jiteki.v11i2.30839

Keywords: Software Refactoring, Method-Level Refactoring, Meta-Learning, Code Smell Detection, Software Maintainability, Swarm Optimization Algorithm, Lion Optimization Algorithm, Stacking Algorithm, Boosting Algorithm

Abstract
Refactoring software code is crucial for developers because it improves code maintainability and reduces technical complexity. Manual refactoring scales poorly, since it demands substantial human intervention and large amounts of training data. This work proposes a method-level refactoring prediction technique based on meta-learning, combining classifier stacking and boosting with the Lion Optimization Algorithm (LOA) for feature selection. The proposed model was evaluated on four open-source Java projects, namely JUnit, McMMO, MapDB, and ANTLR4, and showed strong predictive results. The technique reduced training-data requirements by 30% while improving prediction performance by 10–15% over typical models, reaching 100% accuracy and F1 scores on the DTS3 and DTS4 datasets. The system also reduced false refactoring alerts by 40%, lowering the amount of developer review required.
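The core idea of the abstract, a stacking meta-learner whose base models include boosting classifiers, can be sketched as follows. This is a minimal illustration using scikit-learn with synthetic data, not the paper's actual pipeline; the choice of base estimators, meta-learner, and dataset are assumptions.

```python
# Hedged sketch: a stacking ensemble with boosted base learners and a
# logistic-regression meta-learner, on synthetic binary-labeled data
# standing in for method-level refactoring metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a method-level metrics dataset (label = "refactor?").
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Level-0: boosting models; level-1: a meta-learner trained on their outputs.
base_learners = [
    ("gboost", GradientBoostingClassifier(random_state=0)),
    ("adaboost", AdaBoostClassifier(random_state=0)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_tr, y_tr)
f1 = f1_score(y_te, stack.predict(X_te))
print(f"F1 on held-out split: {f1:.2f}")
```

The meta-learner sees only the base models' cross-validated predictions, which is what lets stacking correct systematic errors of any single boosted model.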
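The abstract also relies on LOA-based feature selection. A heavily simplified, lion-inspired selection loop is sketched below: a small "pride" of candidate feature masks is scored by cross-validated accuracy, and the weakest member is repeatedly replaced by a mutated copy of the best. This is an illustrative swarm-search skeleton under stated assumptions, not the full Lion Optimization Algorithm.

```python
# Hedged sketch: LOA-inspired binary feature selection (simplified).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=15, n_informative=5,
                           random_state=0)

def fitness(mask):
    # Fitness = mean 3-fold CV accuracy using only the selected features.
    if not mask.any():
        return 0.0
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

# Initialise a small "pride" of random boolean feature masks.
pride = rng.random((10, X.shape[1])) < 0.5
scores = np.array([fitness(m) for m in pride])

for _ in range(15):  # a few hunting iterations
    best = pride[scores.argmax()]
    # Offspring: flip two random bits of the current best mask.
    child = best.copy()
    flips = rng.choice(X.shape[1], size=2, replace=False)
    child[flips] = ~child[flips]
    child_score = fitness(child)
    worst = scores.argmin()
    if child_score > scores[worst]:  # replace the weakest lion
        pride[worst] = child
        scores[worst] = child_score

best_mask = pride[scores.argmax()]
print(f"selected {int(best_mask.sum())} features, CV acc {scores.max():.2f}")
```

The real LOA adds pride territories, nomad lions, and mating operators; the point here is only the shape of the fitness-driven search that prunes the metric space before training the ensemble.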
Copyright (c) 2025 Rasha Ahmed

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.