Topic Detection on Twitter Using Deep Learning Method with Feature Expansion GloVe
DOI:
https://doi.org/10.26555/jiteki.v9i3.26736
Keywords:
Topic Classification, Twitter, Tweet, CNN, RNN, GloVe
Abstract
Twitter is a far-reaching medium for communication, information sharing, and the exchange of opinions on a topic. A tweet is a text message limited to 280 characters. Because messages must be brief, tweets often use slang and may not follow structured grammar. The diverse vocabulary in tweets leads to word discrepancies, which make tweets difficult to understand. These factors commonly limit the accuracy of topic classification on tweets. Therefore, the authors used GloVe feature expansion to reduce vocabulary discrepancies by building a corpus from Twitter and IndoNews. Topic classification of tweets has been studied extensively with various Machine Learning and Deep Learning methods using feature expansion. However, to the best of our knowledge, Hybrid Deep Learning has not previously been used for topic classification on Twitter. Therefore, this study conducted experiments to analyze the impact of Hybrid Deep Learning and GloVe feature expansion on topic classification. The dataset used in this study consisted of 55,411 Indonesian-language text samples. The methods used in this study are Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Hybrid CNN-RNN. The results show that, with GloVe feature expansion, the CNN achieved an accuracy of 92.80%, an increase of 0.40% over the baseline, and the RNN achieved 93.72%, an increase of 0.23%. The hybrid CNN-RNN model achieved the highest accuracy, 94.56%, an increase of 2.30%. The hybrid RNN-CNN model also achieved high accuracy, reaching 94.39%, an increase of 0.95%. Based on these results, the Hybrid Deep Learning model with feature expansion improved the system's performance and yielded the highest accuracy.
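The abstract does not spell out how feature expansion works, so the following is only a minimal sketch of one common reading of the idea: when a vocabulary feature is absent from a tweet, it is switched on if one of its nearest neighbours in the embedding space does appear in the tweet. The similarity table `SIMILAR`, the helper names `vectorize` and `expand`, the binary bag-of-words representation, and the sample words are all assumptions made for illustration. In the study, the role of the similarity table would be played by a GloVe corpus built from Twitter and IndoNews.

```python
from typing import Dict, List

# Hypothetical similarity table: for each feature word, its nearest neighbours
# ranked by similarity. The entries are made up for illustration; they are not
# taken from the study's GloVe corpus.
SIMILAR: Dict[str, List[str]] = {
    "bola": ["sepakbola", "futsal"],
    "harga": ["tarif", "biaya"],
}


def vectorize(tokens: List[str], vocabulary: List[str]) -> List[int]:
    """Binary bag-of-words vector: 1 if the vocabulary word occurs in the tweet."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def expand(
    tokens: List[str],
    vocabulary: List[str],
    similar: Dict[str, List[str]],
    top_n: int = 1,
) -> List[int]:
    """Set a zero-valued feature to 1 when one of its top_n neighbours is in the tweet."""
    present = set(tokens)
    vector = vectorize(tokens, vocabulary)
    for i, word in enumerate(vocabulary):
        if vector[i] == 0:
            neighbours = similar.get(word, [])[:top_n]
            if any(n in present for n in neighbours):
                vector[i] = 1
    return vector


vocabulary = ["bola", "harga", "politik"]
tweet = ["nonton", "sepakbola", "malam", "ini"]

baseline = vectorize(tweet, vocabulary)        # no vocabulary word matches
expanded = expand(tweet, vocabulary, SIMILAR)  # "sepakbola" activates "bola"
print(baseline, expanded)
```

In this sketch, `top_n` controls how many neighbours per feature are consulted. A larger value recovers more out-of-vocabulary words but also risks activating features for loosely related terms.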
License
Authors who publish with JITEKI agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
This work is licensed under a Creative Commons Attribution 4.0 International License