Predicting the Level of Emotion by Means of Indonesian Speech Signal

Fergyanto E. Gunawan, Kanyadian Idananta


Understanding human emotion is important for building and maintaining smooth interpersonal relations, all the more so because human thinking and behavior are strongly influenced by emotion. In line with this need, an expert system capable of predicting emotional state would be useful in many practical applications. Such systems, based on speech signals, have been widely developed for various languages. This study evaluates the extent to which Mel-Frequency Cepstral Coefficient (MFCC) features, together with the Teager energy feature, derived from Indonesian speech signals relate to four emotion types: happy, sad, angry, and fear. The study uses empirical data of nearly 300 speech signals collected from four amateur actors and actresses speaking 15 prescribed Indonesian sentences. Using a support vector machine classifier, the findings suggest that the Teager energy and the first MFCC coefficient are crucial features, and that the prediction can reach an accuracy of 86%. The accuracy increases quickly with the first few MFCC features; the fourth and subsequent features have a negligible effect on the accuracy.


Automatic Emotion Recognition; Indonesian Language; Mel Frequency Cepstral Coefficient; Support Vector Machine; Sound Features
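
The abstract names Teager energy as one of the crucial features but does not state how it was computed. As a rough illustration only, the sketch below uses the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and averages it over a signal to obtain a single scalar feature. The function names, the averaging step, and the synthetic test tone are assumptions made for this example, not details taken from the study.

```python
import math


def teager_energy(signal):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The first and last samples have no two-sided neighbours, so the
    result is two samples shorter than the input.
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


def mean_teager_energy(signal):
    """Average Teager energy, used here as one scalar feature per signal
    (an assumption; the study may summarise the energy differently)."""
    psi = teager_energy(signal)
    if not psi:
        raise ValueError("need at least three samples")
    return sum(psi) / len(psi)


# Hypothetical test tone: 200 Hz cosine sampled at 8 kHz, amplitude 0.5.
amplitude = 0.5
omega = 2 * math.pi * 200 / 8000  # angular frequency in radians per sample
tone = [amplitude * math.cos(omega * n) for n in range(400)]

feature = mean_teager_energy(tone)
```

For a pure tone the operator returns the constant value A^2 * sin^2(omega), so it responds to both amplitude and frequency. In a pipeline like the one the abstract describes, a scalar of this kind would presumably be concatenated with the MFCC values before being passed to the classifier.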


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293