Comparative Evaluation of Feature Selection Methods for Heart Disease Classification with Support Vector Machine

Winarsi J. Bidul, Sugiyarto Surono, Tri Basuki Kurniawan

Abstract


This study compares the effectiveness of several feature selection techniques for improving the performance of Support Vector Machine (SVM) models in classifying heart disease data, particularly in the context of big data. The main challenge lies in managing large datasets, which necessitates feature selection to streamline the analysis. Four feature selection methods were therefore explored to identify the most efficient approach: Logistic Regression-Recursive Feature Elimination (LR-RFE), Logistic Regression-Sequential Forward Selection (LR-SFS), Correlation-based Feature Selection (CFS), and Variance Threshold. Existing research indicates that these methods can substantially improve classification accuracy. In this study, combining the SVM model with LR-RFE, LR-SFS, or Variance Threshold gave the best results, achieving the highest accuracy of 89%. On the other evaluation metrics (precision, recall, and F1-score), performance varied depending on the feature selection method chosen and the distribution of data used for training and testing. In general, however, LR-RFE-SVM and Variance Threshold-SVM tended to provide better evaluation values than LR-SFS-SVM and CFS-SVM. In terms of computation time, SVM classification with Variance Threshold as the feature selection method was the fastest, at 118.1540 seconds, while retaining 23 important features. It is therefore important to choose a suitable feature selection technique, taking into account both the number of retained features and the computation time. This research underscores the significance of feature selection in addressing big data challenges, particularly in heart disease classification.
In addition, this study highlights practical implications for healthcare practitioners and researchers by recommending methods that can be integrated into real-world healthcare settings or existing clinical decision support systems.
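
As an illustration of the simplest of the four methods, the following is a minimal sketch of a Variance Threshold filter using only the Python standard library. It is not the authors' implementation: the abstract does not state the threshold value or the software used, so the function names (`variance_threshold`, `select_columns`), the threshold of 0.01, and the toy data are all assumptions made for the example.

```python
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Return the indices of columns whose population variance exceeds `threshold`.

    `rows` is a list of equal-length numeric lists (one list per sample).
    Hypothetical helper for illustration only.
    """
    columns = list(zip(*rows))  # transpose: samples x features -> features x samples
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def select_columns(rows, keep):
    """Keep only the feature columns listed in `keep`, preserving sample order."""
    return [[row[i] for i in keep] for row in rows]


# Toy data (hypothetical, not from the study): 4 samples, 3 features.
# Feature 0 is constant, feature 2 barely varies, feature 1 varies clearly.
X = [
    [1.0, 0.0, 5.0],
    [1.0, 1.0, 5.1],
    [1.0, 0.0, 4.9],
    [1.0, 1.0, 5.0],
]

keep = variance_threshold(X, threshold=0.01)  # assumed threshold
X_reduced = select_columns(X, keep)
print(keep)
print(X_reduced)
```

Because the filter looks only at each feature's variance and never fits a model, it is cheap compared with wrapper methods such as RFE or SFS, which retrain a classifier repeatedly. That is consistent with Variance Threshold having the shortest computation time reported above.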

Keywords


Big Data; Feature Selection; Classification; Heart Disease



DOI: http://dx.doi.org/10.26555/jiteki.v10i2.28647



Copyright (c) 2024 Winarsi J. Bidul, Sugiyarto Surono, Tri Basuki Kurniawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 
 


Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia