Analysis of DBSCAN and K-means algorithm for evaluating outlier on RFM model of customer behaviour
Siti Monalisa, Fitra Kurnia
The aim of study is to discover outlier of customer data to found customer behaviour. The customer behaviour determined with RFM (Recency, Frequency and Monetary) models with K-Mean and DBSCAN algorithm as clustering customer data. There are six step in this study. The first step is determining the best number of clusters with the dunn index (DN) validation method for each algorithm. Based on the dunn index, the best cluster values were 2 clusters with DN value for DBSCAN 1.19 which were minpts and epsilon value 0.2 and 3 and DN for K-Means was 1.31. The next step was to cluster the dataset with the DBSCAN and K-Means algorithm based on the best cluster that was 2. DBSCAN algorithm had 37 outliers data and K-means algorithm had 63 outliers (cluster 1 are 26 outliers and cluster 2 are 37 outliers). This research shown that outlier in DBSCAN and K-Means in cluster 1 have similarities is 100%. But overal outliers similarities is 67%. Based the outliers shown that the behaviour of customers is a small frequency of spending but high recency and monetary.
customer behavior; DBSCAN; dunn index; K-means; outlier and RFM models;