Effect of SMOTE Variants on Software Defect Prediction Classification Based on Boosting Algorithm

Authors

  • Rahmina Ulfah Aflaha, Lambung Mangkurat University
  • Rudy Herteno, Lambung Mangkurat University
  • Mohammad Reza Faisal, Lambung Mangkurat University
  • Friska Abadi, Lambung Mangkurat University
  • Setyo Wahyu Saputro, Lambung Mangkurat University

DOI:

https://doi.org/10.26555/jiteki.v10i2.28521

Keywords:

Software Defect Prediction, Imbalance, SMOTE Variants, Boosting

Abstract

Detecting software defects early is critical for avoiding significant financial losses. However, building accurate software defect prediction models is challenging because of class imbalance: data for defective modules is far scarcer than data for non-defective modules. This research addresses the problem on the imbalanced NASA MDP dataset. It proposes a method that combines a data-level balancing approach, using 14 variants of the SMOTE algorithm to increase the amount of defective-module data, with an algorithm-level approach in which three boosting algorithms (CatBoost, LightGBM, and Gradient Boosting) classify modules as defective or non-defective. The aim is to improve the accuracy of software defect prediction. The results show that the method produces more accurate classification than previous studies. The DSMOTE and Gradient Boosting pair achieved the highest average accuracy (0.9161), the DSMOTE and CatBoost pair achieved the highest average AUC (0.9637), and the ADASYN kernel and CatBoost pair achieved the highest average G-mean (0.9154). The contribution of this research to software defect prediction is the development of these techniques and the evaluation of their effectiveness in addressing class imbalance.
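
To make the two central ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the basic SMOTE scheme (a synthetic minority sample is interpolated between a minority sample and one of its k nearest minority neighbours) and the usual definition of G-mean as the geometric mean of sensitivity and specificity. The function names, the parameter k=3, and the toy data are illustrative assumptions; the 14 SMOTE variants and the boosting classifiers evaluated in the paper are not reproduced here.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points using the basic SMOTE idea:
    interpolate between a minority sample and one of its k nearest
    minority neighbours. (Hypothetical helper, for illustration only.)"""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall on the defective class, 1)
    and specificity (recall on the non-defective class, 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (assumed, not from NASA MDP): four defective-module feature vectors.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority, n_new=6)
print(len(new_points), "synthetic samples generated")

# One of two defective modules is missed; both non-defective ones are correct.
print(round(g_mean([1, 1, 0, 0], [1, 0, 0, 0]), 4))
```

G-mean is informative on imbalanced data because a classifier that labels every module as non-defective can reach high accuracy while its sensitivity, and therefore its G-mean, is zero.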

Author Biographies

Rahmina Ulfah Aflaha, Lambung Mangkurat University

Computer Science Department

Rudy Herteno, Lambung Mangkurat University

Computer Science Department

Mohammad Reza Faisal, Lambung Mangkurat University

Computer Science Department

Friska Abadi, Lambung Mangkurat University

Computer Science Department

Setyo Wahyu Saputro, Lambung Mangkurat University

Computer Science Department

Published

2024-06-21

How to Cite

[1]
R. U. Aflaha, R. Herteno, M. R. Faisal, F. Abadi, and S. W. Saputro, “Effect of SMOTE Variants on Software Defect Prediction Classification Based on Boosting Algorithm”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 10, no. 2, pp. 201–216, Jun. 2024.

Section

Articles
