Reducing Overfitting in Neural Networks for Text Classification Using Kaggle's IMDB Movie Reviews Dataset

Poningsih Poningsih, Agus Perdana Windarto, Putrama Alkhairi

Abstract


Overfitting presents a significant challenge in developing text classification models with neural networks: a model learns the training data too closely, including its noise and idiosyncratic details, and consequently performs poorly on new, unseen data. This study addresses the issue by exploring overfitting reduction techniques to improve the generalization of neural networks on text classification tasks, using the IMDB movie review dataset from Kaggle. The research aims to provide insights into effective methods for reducing overfitting, thereby improving the performance and reliability of text classification models in practical applications. The methodology involves developing two LSTM neural network models: a standard model without overfitting reduction techniques and an enhanced model incorporating dropout and early stopping. The IMDB dataset is preprocessed to convert reviews into sequences suitable for input to the LSTM models. Both models are trained, and their performances are compared using various metrics. The model without overfitting reduction techniques shows a test loss of 0.4724 and a test accuracy of 86.81%. Its precision, recall, and F1-score for classifying negative reviews are 0.91, 0.82, and 0.86, respectively, and for positive reviews are 0.84, 0.92, and 0.87. The enhanced model, incorporating dropout and early stopping, demonstrates improved performance, with a lower test loss of 0.2807 and a higher test accuracy of 88.61%. For negative reviews, its precision, recall, and F1-score are 0.92, 0.84, and 0.88, and for positive reviews 0.86, 0.93, and 0.89. Overall, the enhanced model achieves better metrics, with an accuracy of 89% and macro and weighted averages for precision, recall, and F1-score all at 0.89. Applying overfitting reduction techniques thus significantly enhances the model's performance.
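The two techniques named in the abstract can be illustrated in isolation. The following is a minimal sketch (not the authors' implementation, whose layer sizes and hyperparameters are not given here): inverted dropout randomly zeroes a fraction of activations during training and rescales the survivors so the expected activation is unchanged, while early stopping halts training once the validation loss stops improving for a fixed number of epochs (the "patience"). The function names and the patience value below are illustrative assumptions.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero a fraction `rate` of units at random and rescale the rest
    by 1/(1-rate), so the expected activation stays the same and no
    rescaling is needed at inference time."""
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep
    return activations * mask / keep

def early_stopping_epochs(val_losses, patience):
    """Return how many epochs training runs before stopping: stop as
    soon as the validation loss has not improved for `patience`
    consecutive epochs."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0   # improvement: reset the counter
        else:
            wait += 1              # no improvement this epoch
            if wait >= patience:
                return epoch       # patience exhausted, stop here
    return len(val_losses)         # never triggered: full run

# Example: with patience=2, training stops at epoch 4, after two
# epochs without improvement over the best loss of 0.5.
stopped_at = early_stopping_epochs([0.60, 0.50, 0.55, 0.56, 0.57], patience=2)
```

In a framework such as Keras, the same ideas correspond to a `Dropout` layer inside the model and an `EarlyStopping` callback monitoring `val_loss`; the sketch above only makes the underlying mechanics explicit.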

Keywords


Deep Learning; Neural Networks; Overfitting; Text Classification; Regularization; Dropout


DOI: http://dx.doi.org/10.26555/jiteki.v10i3.29509


Copyright (c) 2024 P Poningsih, Agus Perdana Windarto, Putrama Alkhairi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Jurnal Ilmiah Teknik Elektro Komputer dan Informatika
ISSN 2338-3070 (print) | 2338-3062 (online)
Organized by Electrical Engineering Department - Universitas Ahmad Dahlan
Published by Universitas Ahmad Dahlan
Website: http://journal.uad.ac.id/index.php/jiteki
Email 1: jiteki@ee.uad.ac.id
Email 2: alfianmaarif@ee.uad.ac.id
Office Address: Kantor Program Studi Teknik Elektro, Lantai 6 Sayap Barat, Kampus 4 UAD, Jl. Ringroad Selatan, Tamanan, Kec. Banguntapan, Bantul, Daerah Istimewa Yogyakarta 55191, Indonesia