Distance Functions Study in Fuzzy C-Means Core and Reduct Clustering

Authors

  • Joko Eliyanto, Master Student, Math. Education, Universitas Ahmad Dahlan
  • Sugiyarto Surono, Universitas Ahmad Dahlan, Yogyakarta

DOI:

https://doi.org/10.26555/jiteki.v7i1.20516

Keywords:

Fuzzy C-Means, Objective Function, FCM Distance Function

Abstract

Fuzzy C-Means is a distance-based clustering method that applies the concept of fuzzy logic. The clustering proceeds iteratively to minimize an objective function. The objective function is the sum, over data points, of the product of each point's distance to its closest cluster centroid and its membership degree. As the iterations proceed, the value of the objective function should decrease steadily. The objective of this research is to observe whether commonly used distance functions satisfy this hypothesis, in order to determine the most suitable distance for Fuzzy C-Means clustering. Several distance functions were applied to the same datasets: 5 standard datasets and 2 random datasets were used to test Fuzzy C-Means clustering performance with 7 different distance functions. Accuracy, purity, and the Rand Index were also used to measure the quality of the resulting clusters. The results show that the distance functions that produced the best-quality clusters are the Euclidean, Average, Manhattan, Minkowski, Minkowski-Chebyshev, and Canberra distances. These 6 distances satisfied the basic hypothesis about the behavior of the objective function in the Fuzzy C-Means clustering method. The only distance that did not satisfy the basic hypothesis is the Chebyshev distance.
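The hypothesis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the textbook form of the objective, J = sum over points i and clusters k of u_ik^m * d(x_i, c_k)^2, with fuzzifier m = 2, which the abstract does not spell out. It also assumes the usual weighted-mean centroid update, which is derived for the Euclidean distance and is simply reused here for the other distances. The function names (`fcm`, `euclidean`, and so on), the Minkowski order p = 3, and the synthetic data are hypothetical choices made for illustration. The Average and Minkowski-Chebyshev distances are omitted because their exact definitions are not given on this page.

```python
import numpy as np

# Distance functions between one point x and one centroid c (1-D float arrays).
def euclidean(x, c):
    return np.sqrt(np.sum((x - c) ** 2))

def manhattan(x, c):
    return np.sum(np.abs(x - c))

def chebyshev(x, c):
    return np.max(np.abs(x - c))

def minkowski(x, c, p=3):  # p = 3 is an arbitrary illustrative choice
    return np.sum(np.abs(x - c) ** p) ** (1.0 / p)

def canberra(x, c):
    num = np.abs(x - c)
    den = np.abs(x) + np.abs(c)
    # Coordinates where both values are zero contribute 0 (a common convention).
    return np.sum(np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den != 0))

def fcm(X, n_clusters, dist, m=2.0, n_iter=20, seed=0):
    """Run a fixed number of FCM iterations and record the objective each time."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each row of memberships sums to 1
    history = []
    for _ in range(n_iter):
        W = U ** m
        # Weighted-mean centroid update (assumption: reused for every distance).
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        D = np.array([[dist(x, c) for c in centroids] for x in X])
        D = np.fmax(D, 1e-12)                  # avoid division by zero below
        history.append(float(np.sum(W * D ** 2)))
        inv = D ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard membership update
    return centroids, U, history

# Usage: two synthetic blobs, then check whether the objective never increases.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
for name, d in [("euclidean", euclidean), ("manhattan", manhattan),
                ("chebyshev", chebyshev), ("minkowski", minkowski),
                ("canberra", canberra)]:
    _, _, hist = fcm(X, 2, d)
    non_increasing = all(b <= a + 1e-9 for a, b in zip(hist, hist[1:]))
    print(name, "objective non-increasing:", non_increasing)
```

With the Euclidean distance, both update steps minimize the objective in turn, so the recorded values cannot increase. For the other distances this guarantee does not hold under the assumptions above, which is the kind of behavior the study sets out to observe; the outcome on this toy data says nothing about the paper's datasets.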

Published

2021-04-24

How to Cite

[1]
J. Eliyanto and S. Surono, “Distance Functions Study in Fuzzy C-Means Core and Reduct Clustering”, J. Ilm. Tek. Elektro Komput. Dan Inform, vol. 7, no. 1, pp. 118–130, Apr. 2021.
