Speech classification using combination virtual center of gravity and k-means clustering based on audio feature extraction

Authors

  • Diah Kumalasari, Politeknik Negeri Samarinda
  • Arief Bramanto Wicaksono Putra, Politeknik Negeri Samarinda
  • Achmad Fanany Onnilita Gaffar, Politeknik Negeri Samarinda

Keywords:

Classification, Feature Extraction, K-Means, Virtual Center of Gravity

Abstract

Voice recognition can be performed in a variety of ways. Sound patterns can be recognized through sound feature extraction. The training sound data is built by selecting the best sound data using a correlation coefficient, based on the level of similarity between sound data, to obtain optimal sound features. In this research, sound feature extraction uses the Virtual Center of Gravity method. This method calculates the distance between the sound data and center-of-gravity points, visualized in three-dimensional form as white, black, and grey pattern spaces. The preprocessing stage produces complex-valued data consisting of real and imaginary parts. The Euclidean distance from these values to the Virtual Center of Gravity pattern spaces is then computed. The resulting sound features are tested using K-Means Clustering to classify speech based on the sound data. The results show an accuracy of 92.5%.
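The pipeline the abstract describes, mapping complex-valued spectral data to distances from white/black/grey pattern-space centers and then clustering the resulting features with K-Means, can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the reference-point coordinates in `CENTERS`, the 64-sample toy signals, and the function names are all hypothetical.

```python
import numpy as np

# Hypothetical reference points for the three pattern spaces the abstract
# names (white, black, grey); the paper's real Virtual Center of Gravity
# coordinates are not given here, so these values are illustrative only.
CENTERS = {
    "white": np.array([1.0, 1.0]),
    "black": np.array([0.0, 0.0]),
    "grey":  np.array([0.5, 0.5]),
}

def vcg_features(spectrum):
    """Treat each complex FFT coefficient as a (real, imaginary) point and
    return the mean Euclidean distance to each pattern-space center."""
    points = np.column_stack([spectrum.real, spectrum.imag])
    return np.array(
        [np.linalg.norm(points - c, axis=1).mean() for c in CENTERS.values()]
    )

def kmeans(features, k=2, iters=20, seed=0):
    """Plain k-means over the extracted feature vectors (sketch)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # assign each feature vector to its nearest cluster center
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each center to the mean of its assigned vectors
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers

# Toy usage: two synthetic "sounds" with clearly different spectra.
t = np.linspace(0, 8 * np.pi, 64, endpoint=False)
fft_a = np.fft.fft(np.sin(t))           # pure tone
fft_b = np.fft.fft(np.sign(np.sin(t)))  # square wave, rich in harmonics
feats = np.vstack([vcg_features(fft_a), vcg_features(fft_b)])
labels, _ = kmeans(feats, k=2)
```

Each sound is thus reduced to a three-component feature vector (one distance per pattern space), and K-Means groups sounds whose distance profiles are similar.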

Author Biographies

Diah Kumalasari, Politeknik Negeri Samarinda

Information Technology

Arief Bramanto Wicaksono Putra, Politeknik Negeri Samarinda

Information Technology Department

Achmad Fanany Onnilita Gaffar, Politeknik Negeri Samarinda

Information Technology

References

B. Dave and P. D. S. Pipalia, “Speech Recognition: A Review,” Int. J. Adv. Eng. Res. Dev., vol. 1, no. 12, pp. 230–236, 2014, doi: 10.21090/ijaerd.011244.

K. R. Ghule and R. R. Deshmukh, “Feature Extraction Techniques for Speech Recognition: A Review,” Int. J. Sci. Eng. Res., vol. 6, no. 5, pp. 143–147, 2015.

M. Ference and A. M. Weinberg, “Center of Gravity and Center of Mass,” Am. J. Phys., vol. 6, no. 2, pp. 106–106, 1938, doi: 10.1119/1.1991277.

Y. A. Ibrahim, J. C. Odiketa, and T. S. Ibiyemi, “Preprocessing technique in automatic speech recognition for human computer interaction: an overview,” Ann. Comput. Sci. Ser., vol. XV, no. 1, pp. 186–191, 2017.

A. G. Jondya and B. H. Iswanto, “Indonesian’s Traditional Music Clustering Based on Audio Features,” Procedia Comput. Sci., vol. 116, pp. 174–181, 2017, doi: 10.1016/j.procs.2017.10.019.

O. Of and E. For, “PCA-Based Palmprint Recognition,” Electr. Eng., no. i, pp. 2–5, 2009.

P. Schober and L. A. Schwarte, “Correlation coefficients: Appropriate use and interpretation,” Anesth. Analg., vol. 126, no. 5, pp. 1763–1768, 2018, doi: 10.1213/ANE.0000000000002864.

O. K. Hamid, “Frame Blocking and Windowing Speech Signal,” J. Information, Commun. Intell. Syst., vol. 4, no. 5, 2019.

H. Triwiyanto, O. Wahyunggoro, and H. A. Nugroho, “Performance Analysis of the Windowing Technique on Elbow Joint Angle Estimation Using Electromyography,” J. Phys., 2018.

H. Hauser, E. Gröller, and T. Theußl, “Mastering Windows: Improving Reconstruction,” 2000 IEEE Symp. Vol. Vis. VV 2000, pp. 101–109, 2000, doi: 10.1109/VV.2000.10002.

R. Hibare and A. Vibhute, “Feature Extraction Techniques in Speech Processing: A Survey,” Int. J. Comput. Appl., vol. 107, no. 5, pp. 1–8, 2014, doi: 10.5120/18744-9997.

A. K. F. Haque, “FFT and Wavelet-Based Feature Extraction for Acoustic Audio Classification,” Int. J. Adv. Innov. Thoughts Ideas, pp. 1–7, 2012.

A. B. W. Putra, S. Pramono, and A. Naba, “Rancang Bangun Prototype Ciri Citra Kulit Luar Kayu Tanaman Karet Menggunakan Metode Virtual Center of Gravity” [Prototype Design of Outer Bark Image Features of Rubber Plants Using the Virtual Center of Gravity Method], J. EECCIS, vol. 8, no. 1, pp. 19–26, 2014.

A. V. D. Sano and H. Nindito, “Application of k-means algorithm for cluster analysis on poverty of provinces in Indonesia,” ComTech, no. 6, pp. 141–150, 2011.

O. J. Oyelade and O. O. Oladipupo, “Application of k-Means Clustering algorithm for prediction of Students’ Academic Performance,” Int. J. Comput. Sci. Inf. Secur., vol. 7, pp. 292–295, 2010.

S. Saito, Y. Tomioka, and H. Kitazawa, “A Theoretical Framework for Estimating False Acceptance Rate of PRNU-Based Camera Identification,” IEEE Trans. Inf. Forensics Secur., vol. 12, no. 9, pp. 2026–2035, 2017, doi: 10.1109/TIFS.2017.2692683.

Published

2020-05-29

Section

Articles