Machine learning insight into e-commerce churn: Prediction and preventing customer loss

Authors

Yadav, Rekha

Issue Date

2024-05

Degree

MSc in Business Analytics

Publisher

Dublin Business School

Abstract

Predicting customer churn is key for e-commerce companies that want to build retention strategies and stay competitive. This study addresses customer churn by predicting it with machine learning. Following the CRISP-DM methodology, it implements and evaluates four models: Random Forest Classifier, Logistic Regression, XGBoost and AdaBoost. The models are built and tested on the "E-commerce Customer Behavior and Purchase Dataset". Hyperparameter tuning and performance evaluation are carried out to get the best out of each model. The XGBoost Classifier outperforms the other models on accuracy, precision, recall and F1 score. The research addresses problems such as skewness, parameter validation and model bias, and applies oversampling, under-sampling, grid search and cross-validation as solutions. Future work includes improved feature engineering, real-time retention strategies and greater model interpretability for actionable insights. The results add to the existing body of knowledge on customer churn prediction in e-commerce and help lay the basis for proactive retention strategies. The approach can also be applied in other industries facing the same problems. This research illustrates the effectiveness of machine learning, especially the XGBoost model, in predicting customer churn and underlines the importance of data-driven decision making in the challenging e-commerce environment.
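
The abstract names oversampling as a remedy for skewed data and reports accuracy, precision, recall and F1 score. As a rough illustration only, the sketch below shows random oversampling and those four metrics in plain Python. It is not the thesis implementation: the function names `random_oversample` and `binary_metrics`, the toy labels (1 = churned, 0 = retained) and the example predictions are all assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original data, used as the sampling pool.
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem
    (hypothetical helper; `positive` marks the churn class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy, made-up data: 8 retained customers (0) and 2 churned customers (1).
toy_features = [[i] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2
balanced_x, balanced_y = random_oversample(toy_features, toy_labels)

# Made-up predictions, used only to exercise the metric formulas.
example_true = [1, 1, 1, 0, 0, 0, 0, 0]
example_pred = [1, 1, 0, 1, 0, 0, 0, 0]
example_scores = binary_metrics(example_true, example_pred)
```

In practice, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.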