A Novel Part-of-Speech Set Developing Method for Statistical Machine Translation

Herry Sujaini, Kuspriyanto Kuspriyanto, Arry Akhmad Arman, Ayu Purwarianti

Abstract


Part of speech (PoS) is one of the features that can be used to improve the quality of statistical machine translation (SMT). Typically, the PoS set of a language is determined from the grammar of that language or adopted from the PoS set of another language. This work aims to formulate a model for automatically developing a PoS set as a linguistic factor to improve the quality of machine translation. The research method uses a word similarity approach, in which we cluster the words contained in a corpus. The resulting classes are then defined as the PoS set for the given language. We evaluated the computationally defined PoS set using the MOSES machine translation system, comparing its results with those of SMT using manually generated PoS sets, and assessed the systems with the BLEU method. The languages used for evaluation are English as the source language and Indonesian as the target language.
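
As a rough illustration of the word similarity approach described above, the following minimal Python sketch groups words that occur in similar contexts and treats each group as one candidate PoS class. It is not the authors' model: the neighbour-count vectors, the cosine measure, the greedy clustering, the 0.5 threshold, the toy corpus, and all function names are assumptions chosen only to make the idea concrete.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Count the left and right neighbours of every word (assumed similarity features)."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.5):
    """Greedy single-pass clustering: a word joins the first cluster whose
    seed word is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for word in sorted(vectors):
        for cluster in clusters:
            if cosine(vectors[word], vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Hypothetical toy corpus; a real experiment would use a large monolingual corpus.
corpus = [
    "the cat sleeps",
    "the dog sleeps",
    "a cat runs",
    "a dog runs",
]

# Each cluster becomes one automatically derived PoS class (labels C0, C1, ... are arbitrary).
pos_set = {f"C{i}": words for i, words in enumerate(cluster_words(context_vectors(corpus)))}
print(pos_set)
```

On this toy corpus the sketch yields three classes, `a`/`the`, `cat`/`dog`, and `runs`/`sleeps`, which happen to line up with determiners, nouns, and verbs. Class labels produced this way could then be attached to words as an additional factor when training a translation system.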

Keywords


method; part-of-speech; statistical machine translation; moses; word similarity

DOI: http://dx.doi.org/10.12928/telkomnika.v12i3.79

Copyright (c) 2014 Universitas Ahmad Dahlan

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus, 9th Floor, LPPI Room
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120 ext. 4902, Fax: +62 274 564604

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
