Stemming Javanese affix words using Nazief and Adriani modifications
Abstract
Stemming is the process of reducing a word to its root word through several stages of affix removal. It is mainly used in spell checking and machine translation, and to improve the effectiveness of information retrieval. This study uses the Nazief and Adriani algorithm to stem Javanese affixed words. The first steps are data collection and the construction of a root-word dictionary. Before stemming is carried out, the rules of the Nazief and Adriani algorithm, which are based on the morphology of the Indonesian language, are modified to suit the morphological rules of the Javanese language. Of the 366 words tested, the stemmer produced the correct root word for 351 and an incorrect one for 15. The results show that the modified algorithm can be used for stemming Javanese, with an accuracy of 95.9%.
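The dictionary-driven affix removal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the root-word dictionary, the affix lists, and the function names `stem` and `accuracy_percent` are hypothetical and chosen only for illustration, and the affix lists cover just a small subset of Javanese prefixes and suffixes (no nasal prefixes, infixes, or recoding rules). The sketch keeps the general Nazief and Adriani idea of checking the dictionary first, stripping suffixes and then prefixes, and returning the word unchanged when no root is found.

```python
# Toy root-word dictionary (hypothetical; the study builds its own dictionary).
ROOT_WORDS = {"tulis", "omah", "buku", "pangan", "tuku"}

# Illustrative subsets of Javanese affixes, not the paper's full rule set.
SUFFIXES = ["ake", "ne", "an", "e", "i"]
PREFIXES = ["dak", "kok", "di", "ka", "sa"]


def stem(word, roots=ROOT_WORDS):
    """Return the root of `word`, or `word` itself if no root is found."""
    word = word.lower()

    # Step 1: a word already in the dictionary is returned as-is,
    # which avoids over-stemming roots such as "pangan".
    if word in roots:
        return word

    # Step 2: collect the word plus every form obtained by removing one suffix.
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            candidates.append(word[: -len(suffix)])

    # Step 3: for each candidate, check the dictionary, then try one prefix.
    for candidate in candidates:
        if candidate in roots:
            return candidate
        for prefix in PREFIXES:
            remainder = candidate[len(prefix):]
            if candidate.startswith(prefix) and remainder in roots:
                return remainder

    # Step 4: nothing matched, so the original word is returned unchanged.
    return word


def accuracy_percent(correct, total):
    """Share of correctly stemmed words, as a percentage."""
    return correct / total * 100


if __name__ == "__main__":
    for w in ["ditulis", "tulisan", "omahe", "bukune", "ditulisi", "pangan"]:
        print(w, "->", stem(w))
    # Figures reported in the abstract: 351 correct out of 366 tested words.
    print(round(accuracy_percent(351, 366), 1))
```

The last line reproduces the reported accuracy from the abstract's own counts: 351 / 366 ≈ 0.959, that is 95.9%.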
License
Authors who publish with Jurnal Informatika (JIFO) agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.