Comparative analysis of clustering algorithms for customer segmentation and improved marketing strategies
Authors
Vijay Vade, Parag
Issue Date
2024-05
Degree
MSc in Business Analytics
Publisher
Dublin Business School
Abstract
This study evaluates the effectiveness of K-Means, DBSCAN, and OPTICS clustering algorithms for customer segmentation. Using the Recency, Frequency, and Monetary (RFM) model, customer value was quantified, and customers were segmented based on their transactional behavior.
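As an illustration of the RFM idea, the following minimal sketch computes the three values from a handful of made-up transactions. The sample rows, the `compute_rfm` helper, the snapshot date, and the choice to count each row as one purchase are all assumptions made for this example and are not taken from the study.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
# These rows are invented for illustration and are not the study's data.
transactions = [
    ("C1", date(2011, 11, 1), 50.0),
    ("C1", date(2011, 12, 1), 30.0),
    ("C2", date(2011, 6, 15), 200.0),
    ("C3", date(2011, 12, 8), 20.0),
    ("C3", date(2011, 12, 9), 25.0),
    ("C3", date(2011, 10, 1), 15.0),
]


def compute_rfm(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    Recency is the number of days between the snapshot date and the
    customer's latest purchase, frequency is the number of rows for
    that customer (assumed here to be one purchase per row), and
    monetary is the total amount spent.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


# The snapshot date is an assumed reference point for measuring recency.
rfm = compute_rfm(transactions, snapshot=date(2011, 12, 10))
for customer, values in sorted(rfm.items()):
    print(customer, values)
```

In this toy data, a low recency combined with high frequency (customer C3) marks a recently active repeat buyer, while a high recency with a single large purchase (customer C2) marks a lapsed one-off buyer.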
The dataset was obtained from the UCI Machine Learning Repository and contains transaction details of an online retail business. The data underwent cleaning, feature engineering, normalization, and dimensionality reduction using UMAP. The clustering algorithms were then applied and evaluated using Silhouette Scores and Davies-Bouldin Indices.
K-Means effectively grouped customers, achieving a Silhouette Score of 0.445 and a Davies-Bouldin Index of 0.736. DBSCAN handled noise and identified arbitrarily shaped clusters but produced scattered clusters, with a lower Silhouette Score of 0.132 and a higher Davies-Bouldin Index of 1.435. Although OPTICS scored similarly to DBSCAN, it produced smoother clusters and handled varying densities more effectively.
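A comparison of this kind could be sketched with scikit-learn roughly as follows. This is not the study's code: the data is a synthetic stand-in generated with `make_blobs`, the hyperparameters (`n_clusters`, `eps`, `min_samples`) are placeholder values, the UMAP step is omitted, and excluding noise points (label -1) before scoring is an assumption, since the study's handling of noise is not stated here. A higher Silhouette Score and a lower Davies-Bouldin Index both indicate better-separated clusters.

```python
from sklearn.cluster import DBSCAN, KMeans, OPTICS
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for normalized customer features (not the study's data).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

# Placeholder hyperparameters chosen for this toy data only.
models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "OPTICS": OPTICS(min_samples=10),
}


def evaluate(model, data):
    """Fit a model and return (n_clusters, silhouette, davies_bouldin).

    Noise points (label -1 from DBSCAN/OPTICS) are dropped before
    scoring, which is an assumption of this sketch. Both metrics need
    at least two clusters, so None is returned for them otherwise.
    """
    labels = model.fit_predict(data)
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    if n_clusters < 2:
        return n_clusters, None, None
    return (
        n_clusters,
        silhouette_score(data[mask], labels[mask]),
        davies_bouldin_score(data[mask], labels[mask]),
    )


results = {name: evaluate(model, X) for name, model in models.items()}
for name, (k, sil, dbi) in results.items():
    print(name, k, sil, dbi)
```

Scores from this sketch will not match the figures reported above, since they depend on the data, the preprocessing, and the chosen hyperparameters.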
To summarize, K-Means provided the best cluster separation, while DBSCAN and OPTICS were better suited to handling noise and variable densities.
