A New Strategy of Direct Access for Speaker Identification System Based on Classification

Hery Heryanto, Saiful Akbar, Benhard Sitohang

Abstract


In this paper, we present a new direct access strategy for speaker identification systems. DAMClass is a direct access method that speeds up the identification process without drastically decreasing the identification rate. The proposed method uses a speaker classification strategy based on the original characteristics of the human voice, such as pitch, flatness, brightness, and roll-off. DAMClass decomposes the available dataset into smaller sub-datasets, in the form of classes or buckets, based on the similarity of the speakers' original characteristics. DAMClass builds a speaker dataset index using the range-based indexing of the direct access facility and uses Nearest Neighbor Search, Range-Based Searching, and Multiclass-SVM Mapping as its access methods. Experiments show that the direct access strategy with the Multiclass-SVM algorithm outperforms Range-Based Indexing and Nearest Neighbor in indexing accuracy by one to nine percent. DAMClass is shown to make identification 16 times faster than the sequential access method, with 91.05% indexing accuracy.
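
The bucket idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' DAMClass implementation: it assumes a single characteristic (pitch), hypothetical range boundaries in `PITCH_BOUNDS`, made-up two-dimensional feature vectors, and a plain Euclidean nearest-neighbor match inside the selected bucket. All function and speaker names are invented for illustration.

```python
import bisect
import math

# Hypothetical pitch boundaries in Hz (assumption; the abstract gives no ranges).
# They split speakers into four buckets: <120, 120-165, 165-210, >=210.
PITCH_BOUNDS = [120.0, 165.0, 210.0]


def bucket_of(pitch_hz):
    """Return the index of the pitch range that contains pitch_hz."""
    return bisect.bisect_right(PITCH_BOUNDS, pitch_hz)


def build_index(speakers):
    """Group enrolled speakers into buckets keyed by pitch range."""
    index = {}
    for name, pitch_hz, features in speakers:
        index.setdefault(bucket_of(pitch_hz), []).append((name, features))
    return index


def identify(index, pitch_hz, features):
    """Compare the query only against the speakers in its own bucket."""
    candidates = index.get(bucket_of(pitch_hz), [])
    if not candidates:
        return None
    # Nearest neighbor by Euclidean distance within the bucket.
    return min(candidates, key=lambda c: math.dist(c[1], features))[0]


# Toy enrollment data: (name, pitch in Hz, feature vector). All values are made up.
speakers = [
    ("spk_a", 110.0, (1.0, 2.0)),
    ("spk_b", 115.0, (4.0, 0.5)),
    ("spk_c", 180.0, (1.1, 2.1)),
    ("spk_d", 230.0, (3.0, 3.0)),
]
index = build_index(speakers)
result = identify(index, 112.0, (1.2, 1.9))
print(result)
```

In this toy setup the query is matched against two enrolled speakers instead of four, which is where the speed-up over sequential access would come from. The trade-off is that a query assigned to the wrong bucket never sees its true speaker, which is presumably why indexing accuracy is reported alongside the speed-up.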

Keywords


direct access, speaker identification, MFCC, multiclass classification, speaker indexing






DOI: http://dx.doi.org/10.12928/telkomnika.v13i4.2017



Copyright (c) 2015 Universitas Ahmad Dahlan

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
