From Text to Truth: Leveraging IndoBERT and Machine Learning Models for Hoax Detection in Indonesian News

Muhammad Yusuf Ridho; Evi Yulianti

doi:10.26555/jiteki.v10i3.29450

Authors

Muhammad Yusuf Ridho, Universitas Indonesia
Evi Yulianti, Universitas Indonesia

Keywords:

IndoBERT, Fake news detection, Indonesian News Dataset, Machine Learning, Natural Language Processing, Oversampling-SMOTE, Text Classification, Deep Learning, Comparative model

Abstract

In an era of rapid online information exchange, deceptive content poses a serious threat to public trust and social harmony worldwide. Mechanisms for detecting such content are essential to protect the public. This study develops a machine learning system that automatically detects deceptive content in Indonesian using IndoBERT, a model tailored specifically to the Indonesian language. IndoBERT was selected for its capacity to capture linguistic nuances in Indonesian text that are often challenging for other BERT-based models. The study focuses on evaluating IndoBERT against approaches used in past fake news detection research, such as CNN and LSTM, as well as classification models including Logistic Regression and Naïve Bayes. To address the class imbalance between hoax and valid labels, we applied the SMOTE oversampling technique to augment and balance the data. The dataset consists of publicly available Indonesian-language news articles, each categorized as hoax or valid through a three-judge voting system. IndoBERT Large achieved an accuracy of 98% on the oversampled dataset, compared with 92% on the original dataset. SMOTE oversampling thus helped balance the data and improve the model's performance. These results highlight IndoBERT's capability in detecting fake news and pave the way for its potential integration into real-world applications.
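The two data-preparation ideas named in the abstract, labeling by a three-judge vote and balancing classes with SMOTE, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy two-dimensional vectors, and the choice of `k` are hypothetical, and the abstract does not state which class is the minority or which features SMOTE was applied to. The oversampling function implements only the core SMOTE step, interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most judges (assumes an odd number of votes)."""
    return Counter(votes).most_common(1)[0][0]


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy feature vectors (hypothetical; real inputs would be text features or embeddings).
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
majority_class = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0],
                  [6.0, 6.0], [5.5, 5.5], [6.5, 5.0]]

# Three judges vote on one article; the majority decides its label.
label = majority_label(["hoax", "valid", "hoax"])

# Add synthetic minority samples until both classes have the same size.
new_points = smote_like_oversample(
    minority_class, n_new=len(majority_class) - len(minority_class)
)
balanced_minority = minority_class + new_points
```

In practice a maintained implementation such as the SMOTE class in the imbalanced-learn library would normally be preferred; the hand-written version above is only meant to show what the interpolation does.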

Author Biographies

Muhammad Yusuf Ridho, Universitas Indonesia

The Faculty of Computer Science at Universitas Indonesia

Evi Yulianti, Universitas Indonesia

Evi Yulianti is a lecturer and researcher at Fasilkom UI.

Education:
Ph.D (Computer Science, Royal Melbourne Institute of Technology)
M.Comp.Sc. (Computer Science, Royal Melbourne Institute of Technology)
M.Kom. (Computer Science, University of Indonesia)
S.Kom. (Computer Science, University of Indonesia)

Published

2024-09-10

How to Cite

[1]

M. Y. Ridho and E. Yulianti, “From Text to Truth: Leveraging IndoBERT and Machine Learning Models for Hoax Detection in Indonesian News”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 10, no. 3, pp. 544–555, Sep. 2024.

Issue

Vol. 10 No. 3 (2024): September

Section

Articles

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Authors who publish with JITEKI agree to the following terms:

Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
