Effect of SMOTE Variants on Software Defect Prediction Classification Based on Boosting Algorithm

Rahmina Ulfah Aflaha, Rudy Herteno, Mohammad Reza Faisal, Friska Abadi, Setyo Wahyu Saputro

Abstract


Detecting software defects early is critical for avoiding significant financial losses. However, building accurate software defect prediction models is challenging because of class imbalance: data for defective modules is far scarcer than data for non-defective modules. This research addresses the issue on the imbalanced NASA MDP datasets. It proposes a method that combines a data-level balancing approach, in which 14 variants of the SMOTE algorithm increase the amount of defective-module data, with an algorithm-level approach, in which three boosting algorithms (CatBoost, LightGBM, and Gradient Boosting) classify modules as defective or non-defective. The aim is to improve the accuracy of software defect prediction. The results show that the proposed method can produce more accurate classification than previous studies. The DSMOTE and Gradient Boosting pair achieved the highest average accuracy (0.9161). The DSMOTE and CatBoost pair achieved the highest average AUC (0.9637). The ADASYN kernel and CatBoost pair achieved the highest average G-mean (0.9154). The contribution of this research to software defect prediction is the development of new techniques and an evaluation of their effectiveness in addressing class imbalance.
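To make the data-level step and the G-mean metric concrete, the following is a minimal sketch in plain Python. It shows basic SMOTE interpolation only, not any of the 14 variants evaluated in the paper, and it omits the boosting classifiers. The function names, the parameter `k`, and the toy data are illustrative assumptions, and G-mean is taken here in its common form as the geometric mean of sensitivity and specificity, which the abstract does not spell out.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Basic SMOTE sketch (hypothetical helper, not the paper's code).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (label 1 = defective)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (assumed for illustration): 4 defective modules, 2 features each.
defective = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_oversample(defective, n_new=6)
print(len(new_points), "synthetic defective samples")
print(round(g_mean([1, 1, 0, 0], [1, 0, 0, 0]), 4))
```

In an imbalanced setting, G-mean drops toward zero when a classifier ignores the minority class, which is why it is reported alongside accuracy and AUC.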

Keywords


Software Defect Prediction; Imbalance; SMOTE Variants; Boosting


DOI: http://dx.doi.org/10.26555/jiteki.v10i2.28521


Copyright (c) 2024 Rahmina Ulfah Aflaha, Rudy Herteno, Mohammad Reza Faisal, Friska Abadi, Setyo Wahyu Saputro

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 
 


Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia