Genetic Algorithm and GloVe for Information Credibility Detection Using Recurrent Neural Networks on Social Media Twitter (X)

Andi Nailul Izzah Ramadhani, Erwin Budi Setiawan

Abstract


Social media, especially X, has become a key source of information for many people, but the trustworthiness of the information spread on these platforms is a critical issue. To address this problem, this research proposes an information credibility detection system based on a Recurrent Neural Network (RNN) combined with TF-IDF feature extraction, GloVe feature expansion, BERT word embedding, and Genetic Algorithm (GA) optimization. The system assesses the credibility of tweets related to the 2024 Indonesian election by using TF-IDF to identify important words, GloVe to enrich word context, BERT to provide deeper contextual understanding, and GA to optimize RNN performance. The main goal is to maximize accuracy by integrating these methods. The dataset consists of 54,766 tweets about the 2024 Indonesian election, with roughly equal numbers of credible and non-credible labels. The similarity corpus was built from three sources: X (40,466 entries), IndoNews (131,580 entries), and a combination of both (150,943 entries). Six experimental scenarios were conducted: optimal data split, max features, N-grams, Top-N rank similarity corpus, BERT, and GA application. Across these scenarios, the model reached an accuracy of 90.60%, an improvement of 1.81% over the baseline. This result shows that the proposed system detects information credibility more accurately than the baseline model.

Keywords


BERT; Genetic Algorithm; GloVe; Information Credibility; Recurrent Neural Network; TF-IDF
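
The feature expansion step named in the abstract can be pictured with a small sketch. The abstract does not spell out the procedure, so the following is an assumption about how a Top-N similarity corpus might be used: each vocabulary word is mapped to its N most similar words by cosine similarity, and a feature whose weight is zero for a given tweet borrows the weight of the first similar word that does occur in that tweet. The function names, the 2-d vectors, and the weights are all hypothetical; real GloVe vectors and TF-IDF weights would come from trained models.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_similarity_corpus(embeddings, top_n):
    """Map each word to its top_n most similar words, best match first."""
    corpus = {}
    for word, vec in embeddings.items():
        ranked = sorted(
            ((cosine(vec, other), w) for w, other in embeddings.items() if w != word),
            reverse=True,
        )
        corpus[word] = [w for _, w in ranked[:top_n]]
    return corpus


def expand_features(weights, vocabulary, corpus):
    """Fill zero-valued features with the weight of a similar word in the tweet."""
    expanded = []
    for term in vocabulary:
        value = weights.get(term, 0.0)
        if value == 0.0:
            # Assumed rule: borrow from the highest-ranked similar word present.
            for similar in corpus.get(term, []):
                if weights.get(similar, 0.0) > 0.0:
                    value = weights[similar]
                    break
        expanded.append(value)
    return expanded


# Hypothetical 2-d vectors standing in for trained GloVe embeddings.
embeddings = {
    "election": (1.0, 0.1),
    "vote": (0.9, 0.2),
    "ballot": (0.8, 0.3),
    "hoax": (0.1, 1.0),
    "rumor": (0.2, 0.9),
}
vocabulary = list(embeddings)
similarity_corpus = build_similarity_corpus(embeddings, top_n=2)

# Hypothetical TF-IDF weights for a tweet that only contains "vote".
tweet_weights = {"vote": 0.7}
expanded = expand_features(tweet_weights, vocabulary, similarity_corpus)
print(dict(zip(vocabulary, expanded)))
```

In this toy run, "election" and "ballot" inherit the weight of "vote" because "vote" appears in their Top-2 lists, while "hoax" and "rumor" stay at zero. A larger N fills more zero-valued features but also admits weaker matches, which is presumably why the Top-N rank is treated as an experimental scenario.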





DOI: http://dx.doi.org/10.26555/jiteki.v10i2.29185



Copyright (c) 2024 Andi Nailul Izzah Ramadhani, Erwin Budi Setiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


 
 


Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia