WCLOUDVIZ: Word Cloud Visualization of Indonesian News Articles Classification Based on Latent Dirichlet Allocation

Retno Kusumaningrum, Satriyo Adhy, Suryono Suryono


Latent Dirichlet Allocation (LDA) is a widely used approach for extracting hidden topics from documents. Each topic is obtained by soft clustering of words based on their co-occurrence in documents and is represented as a multinomial probability distribution over terms. Several visualizations of LDA output have been developed, such as matrix designs, text-based designs, tree designs, parallel coordinates, and force-directed graphs. Furthermore, given a set of documents representing each class (category), a classification task can be performed by comparing the topic proportion of each class with the topic proportion of the test document using Kullback-Leibler Divergence (KLD). The purpose of this study is therefore to develop a system for visualizing the output of LDA when it is used for classification. The visualization system consists of two parts: a bar chart and a dependent word cloud. The bar chart shows the trend of each category, while the word cloud shows the words that represent each selected category. This visualization is called WCloudViz. It presents the results in a clear, understandable, and readily shareable form.
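
The KLD-based comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The category names, the topic proportions, the function names `kl_divergence` and `classify`, and the direction of the divergence (test document against class) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    eps guards against division by zero when q has a zero entry
    (an assumption of this sketch, not taken from the paper).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def classify(doc_topics, class_topics):
    """Pick the class whose topic proportion is closest to the document's.

    Returns the winning label and the divergence score for every class.
    """
    scores = {
        label: kl_divergence(doc_topics, proportions)
        for label, proportions in class_topics.items()
    }
    best_label = min(scores, key=scores.get)
    return best_label, scores


# Hypothetical topic proportions over three topics for three categories.
class_topics = {
    "sports": [0.70, 0.20, 0.10],
    "politics": [0.10, 0.75, 0.15],
    "economy": [0.15, 0.15, 0.70],
}

# Hypothetical topic proportion inferred for one test document.
doc_topics = [0.60, 0.25, 0.15]

label, scores = classify(doc_topics, class_topics)
print(label)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

The per-class scores are the kind of values a bar chart over categories could display, with the lowest divergence marking the predicted category.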


latent dirichlet allocation; topic modeling; news articles classification; data visualization; word cloud



DOI: http://dx.doi.org/10.12928/telkomnika.v16i4.8194


Copyright (c) 2018 Universitas Ahmad Dahlan

TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293
Universitas Ahmad Dahlan, 4th Campus, 9th Floor, LPPI Room
Jl. Ringroad Selatan, Kragilan, Tamanan, Banguntapan, Bantul, Yogyakarta, Indonesia 55191
Phone: +62 (274) 563515, 511830, 379418, 371120 ext. 4902, Fax: +62 274 564604

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.