Feature Selection Method Based on Improved Document Frequency

Wei Zheng, Guohe Feng

Abstract


Feature selection is an important step in text classification, and the choice of evaluation function directly affects the quality of the selected features. Document frequency (DF) is one of several commonly used feature selection methods. Its shortcomings are that the construction of its evaluation function lacks a theoretical basis and that it tends to select high-frequency words. To address this problem, we propose an improved algorithm, named DFM, that incorporates the class distribution of features, and we implement it in code. We compare DFM experimentally with several commonly used feature selection methods, using a support vector machine as the text classifier. The results show that DFM performs stably as a feature selection method and yields better classification results than the other methods.
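To make the baseline concrete, the following is a minimal sketch of plain DF-based feature selection, together with a per-class document frequency count of the kind a class-aware method could draw on. It is not the DFM algorithm, whose scoring function is not given in this abstract. The function names, the toy corpus, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def select_by_df(docs, k):
    """Return the k terms with the highest document frequency.

    Ties are broken alphabetically (an assumption) so the result is
    deterministic.
    """
    df = document_frequency(docs)
    ranked = sorted(df.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def class_document_frequency(docs, labels):
    """Per-class document frequency: class label -> Counter(term -> count)."""
    per_class = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        per_class[label].update(set(tokens))
    return per_class


# Toy corpus (hypothetical, not taken from the paper's experiments).
docs = [
    ["the", "match", "ended", "in", "a", "draw"],
    ["the", "team", "won", "the", "match"],
    ["the", "market", "fell", "sharply"],
    ["the", "bank", "raised", "rates"],
]
labels = ["sport", "sport", "finance", "finance"]

df = document_frequency(docs)
top_terms = select_by_df(docs, 2)
per_class = class_document_frequency(docs, labels)
```

In this toy corpus, "the" occurs in every document and therefore ranks first under plain DF, even though it is spread evenly across both classes. "match" occurs only in the "sport" documents, which is the kind of class distribution information that plain DF ignores.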


DOI: http://dx.doi.org/10.12928/telkomnika.v12i4.536


Copyright (c) 2014 Universitas Ahmad Dahlan

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus, 9th Floor, LPPI Room
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120 ext. 4902, Fax: +62 274 564604

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
