Film Recommendation System Using Content-Based Filtering and the Convolutional Neural Network (CNN) Classification Methods
DOI:
https://doi.org/10.26555/jiteki.v9i4.28113
Keywords:
Recommender System, Twitter, Content Based Filtering, Word Embedding, RoBERTa, TFIDF, Classification, Convolutional Neural Network
Abstract
Managing large amounts of data is a challenge for users, so a recommendation system is needed as an information filter that suggests relevant items. Twitter is often used to find movie reviews, which can serve as a basis for developing recommendation systems. This research contributes by applying content-based filtering together with Convolutional Neural Network (CNN) classification. To the best of the researcher's knowledge, no previous research has addressed this combination of method and classification. The main focus is to evaluate the development of a recommendation system by integrating and comparing similarity identification methods using the RoBERTa and TF-IDF approaches. RoBERTa and TF-IDF are applied as vectorizers, and a classification method is applied to form a model that can recognize patterns in the data and produce accurate predictions based on its features. The data comprise 854 movies and 34,086 film reviews from 44 Twitter accounts. The SMOTE method was applied to overcome data imbalance. Three experiments were conducted, with accuracy increasing from one to the next. The first experiment applied TF-IDF as the baseline and SMOTE to CNN classification. The second applied the baseline, SMOTE, and embedding to CNN classification. The third applied the baseline, SMOTE, embedding, and an optimizer to CNN classification. The experimental results show that TF-IDF as the baseline, combined with SMOTE, embedding, and the SGD optimizer at the best learning rate, gives the best CNN classification results, with an accuracy of 86.41%. Thus, the system can provide relevant movie recommendations with good prediction accuracy and performance.
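As a rough illustration of the TF-IDF similarity step mentioned in the abstract, the sketch below ranks films by the cosine similarity of their review text. It is a minimal, standard-library example under stated assumptions and not the authors' implementation: the function names (`tfidf_vectors`, `cosine`, `recommend`), the smoothed IDF formula, the whitespace tokenizer, and the three toy reviews are all hypothetical. The RoBERTa embeddings, SMOTE resampling, and CNN classifier described in the abstract are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(target, texts, top_k=2):
    """Rank the other films by the similarity of their text to the target's."""
    titles = list(texts)
    vectors = dict(zip(titles, tfidf_vectors([texts[t] for t in titles])))
    scores = [(t, cosine(vectors[target], vectors[t])) for t in titles if t != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy data invented for illustration; not taken from the paper's dataset.
reviews = {
    "Film A": "space crew explores distant planet",
    "Film B": "astronaut crew stranded on distant planet",
    "Film C": "romantic comedy set in paris",
}
ranked = recommend("Film A", reviews)
```

Here `ranked` lists "Film B" first because it shares terms with "Film A", while "Film C" shares none and scores zero. In a pipeline like the one the abstract describes, similarity scores or vectors of this kind would feed a downstream classifier instead of being used directly as the final recommendation.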
Published
2024-02-12
Section
Articles
License
Authors who publish with JITEKI agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
This work is licensed under a Creative Commons Attribution 4.0 International License