Word Embedding Feature for Improvement Machine Learning Performance in Sentiment Analysis Disney Plus Hotstar Comments

Jasmir Jasmir, Nurhadi Nurhadi, Eni Rohaini, M Riza Pahlevi B, Daniel Sintong Pardamean Simanjuntak

Abstract


In this research, we apply several machine learning methods and word embedding features to social media data, specifically user comments on the Disney Plus Hotstar application. The word embedding features are Word2Vec, GloVe, and FastText. Our aim is to evaluate the impact of these features on the classification performance of three machine learning methods: Naive Bayes (NB), K-Nearest Neighbor (KNN), and Random Forest (RF). NB is simple and efficient but very sensitive to feature selection. KNN has known weaknesses, including a biased choice of k, high computational cost, memory limitations, and sensitivity to irrelevant attributes. RF, in turn, can produce evaluation values that change significantly with only a slight change in the data. Feature selection is therefore crucial in text classification for enhancing scalability, efficiency, and accuracy. Our test results indicate that KNN achieved the highest accuracy both before and after feature selection. The FastText feature gave KNN its best performance, yielding balanced accuracy, precision, recall, and F1-score values.
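The abstract does not include code, so the following is only a minimal sketch of the kind of pipeline it describes: comments are turned into vectors by averaging word embeddings, classified with KNN, and scored with accuracy, precision, recall, and F1. Everything in it is an assumption made for illustration: the three-dimensional `TOY_VECTORS`, the `TRAIN` comments, the cosine-similarity KNN, and the function names are hypothetical and are not the authors' data, settings, or implementation. A real experiment would load pretrained Word2Vec, GloVe, or FastText vectors instead.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors, invented for this sketch.
# A real setup would load pretrained Word2Vec, GloVe, or FastText vectors.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "love": [0.8, 0.2, 0.1],
    "smooth": [0.7, 0.0, 0.2],
    "bad": [0.0, 0.9, 0.1],
    "crash": [0.1, 0.8, 0.0],
    "slow": [0.0, 0.7, 0.3],
}

# Hypothetical labeled comments (not taken from the paper's dataset).
TRAIN = [
    ("great app love it", "pos"),
    ("smooth streaming great", "pos"),
    ("love the smooth player", "pos"),
    ("bad app crash", "neg"),
    ("slow and bad", "neg"),
    ("crash crash slow", "neg"),
]


def embed(comment, vectors=TOY_VECTORS, dim=3):
    """Average the vectors of known words; unknown words are skipped."""
    known = [vectors[w] for w in comment.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def knn_predict(train, comment, k=3):
    """Majority vote among the k training comments most similar to `comment`."""
    query = embed(comment)
    ranked = sorted(train, key=lambda item: cosine(embed(item[0]), query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def scores(y_true, y_pred, positive="pos"):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

As a usage example, `knn_predict(TRAIN, "love great")` classifies a new comment, and `scores(y_true, y_pred)` summarizes a list of predictions against the true labels. Swapping the embedding table while keeping the classifier fixed is one simple way to compare embedding features, which is the comparison the abstract reports.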

Keywords


Text Classification; Machine Learning Evaluation; Word Embedding; Sentiment Analysis; Social Media Analysis


DOI: http://dx.doi.org/10.26555/jiteki.v10i2.28799


Copyright (c) 2024 Jasmir Jasmir, Nurhadi Nurhadi, Eni Rohaini, M Riza Pahlevi B, Daniel Sintong Pardamean Simanjuntak

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 

Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia