Distance Functions Study in Fuzzy C-Means Core and Reduct Clustering

Joko Eliyanto; Sugiyarto Surono

doi:10.26555/jiteki.v7i1.20516

Authors

Joko Eliyanto, Master Student, Math. Education, Universitas Ahmad Dahlan
Sugiyarto Surono, Universitas Ahmad Dahlan, Yogyakarta

DOI:

https://doi.org/10.26555/jiteki.v7i1.20516

Keywords:

Fuzzy C-Means, Objective Function, FCM Distance Function

Abstract

Fuzzy C-Means is a distance-based clustering method that applies the concept of fuzzy logic. The clustering proceeds iteratively to minimize an objective function. The objective function is the sum of the products of each data point's distance to its cluster centroid and its membership degree. As the iterations proceed, the objective function should keep decreasing. The objective of this research is to observe whether the commonly used distances satisfy this hypothesis, in order to determine the most suitable distance for Fuzzy C-Means clustering. Several distance functions were applied to the same datasets: five standard datasets and two random datasets were used to test Fuzzy C-Means clustering performance with seven different distance functions. Accuracy, purity, and the Rand index were also used to measure the quality of the resulting clusters. The results show that the distance functions producing the best-quality clusters are the Euclidean, Average, Manhattan, Minkowski, Minkowski-Chebyshev, and Canberra distances. These six distances satisfied the basic hypothesis about the behavior of the objective function in the Fuzzy C-Means clustering method. The only distance that did not satisfy the basic hypothesis was the Chebyshev distance.
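The experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes the standard Fuzzy C-Means formulation, with objective J = sum over i, k of u_ik^m * d(x_i, v_k)^2, fuzzifier m = 2, and weighted-mean centroids. The function names, the Minkowski order p = 3, and the random initialization are hypothetical choices. The Average and Minkowski-Chebyshev distances are left out because the abstract does not define them. Note that the weighted-mean centroid update is only guaranteed to reduce J for the Euclidean distance, so for other distances the decrease has to be checked empirically.

```python
import numpy as np

# Distance functions; x and v broadcast against each other on the last axis.
def euclidean(x, v):
    return np.sqrt(np.sum((x - v) ** 2, axis=-1))

def manhattan(x, v):
    return np.sum(np.abs(x - v), axis=-1)

def chebyshev(x, v):
    return np.max(np.abs(x - v), axis=-1)

def minkowski(x, v, p=3):  # p = 3 is an arbitrary illustrative order
    return np.sum(np.abs(x - v) ** p, axis=-1) ** (1.0 / p)

def canberra(x, v):
    num = np.abs(x - v)
    den = np.abs(x) + np.abs(v)
    # Terms with a zero denominator contribute 0 by convention.
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return np.sum(ratio, axis=-1)

def fuzzy_c_means(X, c, dist, m=2.0, n_iter=30, seed=0):
    """Run FCM with a pluggable distance and record the objective per iteration."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row of memberships sums to 1
    history = []
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted-mean centroids, shape (c, d)
        D = dist(X[:, None, :], V[None, :, :])   # point-to-centroid distances, shape (n, c)
        D = np.maximum(D, 1e-12)                 # avoid division by zero below
        history.append(float(np.sum(W * D ** 2)))
        inv = D ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True) # standard membership update
    return U, V, history

def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority class of their cluster."""
    total = 0
    for k in np.unique(labels_pred):
        _, counts = np.unique(labels_true[labels_pred == k], return_counts=True)
        total += counts.max()
    return total / len(labels_true)

def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree."""
    same_true = labels_true[:, None] == labels_true[None, :]
    same_pred = labels_pred[:, None] == labels_pred[None, :]
    iu = np.triu_indices(len(labels_true), k=1)
    return float(np.mean(same_true[iu] == same_pred[iu]))

if __name__ == "__main__":
    # Synthetic two-cluster data, used only to exercise the sketch.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(4, 0.5, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)
    for dist in (euclidean, manhattan, chebyshev, minkowski, canberra):
        U, V, hist = fuzzy_c_means(X, 2, dist)
        decreasing = all(b <= a + 1e-8 * abs(a) for a, b in zip(hist, hist[1:]))
        labels = U.argmax(axis=1)
        print(dist.__name__, decreasing, purity(y, labels), rand_index(y, labels))
```

Running the script prints, for each distance, whether the recorded objective values never increased, along with the purity and Rand index of the hardened clusters. Results on this synthetic data say nothing about the datasets used in the study.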

