Hybrid Approach-RSMOTE for Handling Class Imbalance with Label Noise

Authors

  • Hartono Hartono
  • Erianto Ongko

DOI:

https://doi.org/10.26555/jiteki.v8i3.23684

Abstract

Class imbalance is a major problem in classification. It arises because real-world datasets are frequently imbalanced, with one class containing more instances than the others. In handling class imbalance, a Hybrid Approach that blends data-level and algorithm-level approaches produces good results. However, classification accuracy is reduced not only by class imbalance but also by the complexity of the data, which gives rise to noisy minority samples that lie between the minority and majority classes. To determine how close minority samples are to their homogeneous and heterogeneous nearest neighbors, the relative density must be calculated. The closer a minority sample is to its homogeneous nearest neighbors, the greater its relative density; a sample with high relative density is in a safe state, whereas one with low relative density is categorized as a noisy sample. This research combines the Hybrid Approach with self-adaptive Robust SMOTE (RSMOTE), an adaptive variant of SMOTE that applies the concept of relative density when over-sampling minority samples. The contribution of this research is the implementation of the Hybrid Approach-RSMOTE for handling class imbalance with noise, using relative density in over-sampling to improve classification performance. The results show that both the Hybrid Approach-RSMOTE and the Hybrid Approach-SMOTE handle class imbalance well. However, the Hybrid Approach-RSMOTE gives better Precision, Recall, F1-Measure, and G-Mean, with significant differences. These results indicate that the performance of the Hybrid Approach in handling class imbalance is influenced by the choice of over-sampling method, and that RSMOTE can be considered as an over-sampling method for the Hybrid Approach.
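The relative-density idea described in the abstract can be illustrated with a minimal sketch. The formula used here is an assumption made for illustration, not necessarily the exact definition used in the paper: relative density is taken as the mean distance to the k nearest heterogeneous (majority) neighbors divided by the mean distance to the k nearest homogeneous (minority) neighbors. The function name `relative_density`, the toy data, and the fixed threshold of 1.0 are all hypothetical; RSMOTE itself is described as self-adaptive, so a fixed threshold is a simplification.

```python
import numpy as np

def relative_density(X_min, X_maj, k=3, eps=1e-12):
    """Illustrative relative density for each minority sample.

    Assumed definition (not taken from the paper): mean distance to the
    k nearest majority neighbours divided by mean distance to the
    k nearest minority neighbours. Larger values mean the sample sits
    closer to its own class than to the other class.
    """
    X_min = np.asarray(X_min, dtype=float)
    X_maj = np.asarray(X_maj, dtype=float)

    # Pairwise Euclidean distances: minority-to-minority, minority-to-majority.
    d_homo = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    d_hete = np.linalg.norm(X_min[:, None, :] - X_maj[None, :, :], axis=2)
    np.fill_diagonal(d_homo, np.inf)  # a sample is not its own neighbour

    # Do not ask for more neighbours than exist.
    k_homo = min(k, len(X_min) - 1)
    k_hete = min(k, len(X_maj))

    homo = np.sort(d_homo, axis=1)[:, :k_homo].mean(axis=1)
    hete = np.sort(d_hete, axis=1)[:, :k_hete].mean(axis=1)
    return hete / (homo + eps)

# Toy data: four minority samples form a cluster near the origin,
# a fifth minority sample sits inside the majority cluster.
X_min = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]]
X_maj = [[10, 11], [11, 10], [9, 10], [10, 9]]

rd = relative_density(X_min, X_maj, k=3)
is_noisy = rd < 1.0  # hypothetical fixed threshold, for illustration only

for point, density, noisy in zip(X_min, rd, is_noisy):
    print(point, round(float(density), 3), "noisy" if noisy else "safe")
```

Under this assumed definition, the four clustered minority samples have a relative density well above 1 and are treated as safe, while the sample surrounded by majority points falls below 1 and is flagged as noisy. An over-sampler built on this idea would generate synthetic samples only from, or mainly around, the safe samples.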

Published

2022-10-13

Section

Articles