From Text to Truth: Leveraging IndoBERT and Machine Learning Models for Hoax Detection in Indonesian News

Authors

  • Muhammad Yusuf Ridho, Universitas Indonesia
  • Evi Yulianti, Universitas Indonesia

DOI:

https://doi.org/10.26555/jiteki.v10i3.29450

Keywords:

IndoBERT, Fake news detection, Indonesian News Dataset, Machine Learning, Natural Language Processing, Oversampling-SMOTE, Text Classification, Deep Learning, Comparative model

Abstract

In an era of rapid online information exchange, deceptive content poses a serious threat to public trust and social harmony worldwide. Mechanisms for detecting such content are essential to protect the public. This study develops a machine learning system that automatically detects deceptive content in Indonesian using IndoBERT, a model tailored to the intricacies of the Indonesian language. IndoBERT was selected for its capacity to capture the linguistic nuances of Indonesian text, which are often challenging for other models built on the BERT framework. The study focuses on assessing IndoBERT against approaches used in earlier fake news detection research, such as CNN, LSTM, and classification models including Logistic Regression and Naïve Bayes. To address the imbalance between hoax and valid labels, we applied the SMOTE oversampling technique to augment and balance the data. The dataset consists of publicly available Indonesian-language news articles, each categorized as hoax or valid through a three-judge voting system. IndoBERT Large performed strongly, achieving 98% accuracy on the oversampled dataset compared with 92% on the original dataset. SMOTE oversampling helped balance the data and improved the model's performance. These results highlight IndoBERT's capability in detecting fake news and pave the way for its potential integration into real-world applications.
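To make the oversampling step concrete, the following is a minimal, dependency-free sketch of the SMOTE idea: a synthetic minority-class point is created by interpolating between a minority sample and one of its nearest minority neighbours. It is an illustration only, not the authors' implementation. The function name `smote_oversample`, the parameter `k`, and the toy two-dimensional vectors are assumptions, and the abstract does not state which feature representation SMOTE was applied to.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a
    minority-class point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the chosen point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up feature vectors (hypothetical; not the paper's data).
majority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25],
            [0.3, 0.2], [0.25, 0.1], [0.05, 0.15]]
minority = [[0.8, 0.9], [0.9, 0.8], [0.85, 0.95]]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a maintained library implementation would normally be preferred over a hand-written one, and oversampling is usually applied to the training split only so that synthetic points do not leak into evaluation data.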

Author Biographies

Muhammad Yusuf Ridho, Universitas Indonesia

The Faculty of Computer Science at Universitas Indonesia

Evi Yulianti, Universitas Indonesia

Evi Yulianti is a lecturer and researcher at Fasilkom UI.

Education:
Ph.D. (Computer Science, Royal Melbourne Institute of Technology)
M.Comp.Sc. (Computer Science, Royal Melbourne Institute of Technology)
M.Kom. (Computer Science, University of Indonesia)
S.Kom. (Computer Science, University of Indonesia)

Published

2024-09-10

Issue

Section

Articles