Customer churn is a critical concern for organizations in the telecommunications industry. The industry is estimated to have an annual churn rate of approximately 30%, which leads to a large loss of revenue every year. Although the telecom industry was one of the first adopters of data mining and machine learning techniques for extracting insights from large datasets, customer churn remains a persistent problem. This thesis presents a predictive analytics approach to reducing customer churn in the telecom industry, together with the application of a technique typically used in retail contexts, known as “cross-selling” or “market basket analysis”.
A publicly available telecom dataset was used for the analysis. Four classification algorithms, K-Nearest Neighbor, Decision Tree, Naïve Bayes and Random Forest, were used to predict customer churn in RapidMiner and R. Apriori and FP-Growth were implemented in RapidMiner to uncover associations between the attributes in the dataset. The results show that Decision Tree and Random Forest are the two most accurate algorithms for predicting customer churn. The “cross-selling” results show that association algorithms are a practical way to discover associations between items and services in this industry. Telecom companies can use the discovered patterns and frequent item sets to engage customers and to offer services in a way that benefits their operations.
Overall, this study identifies the key drivers of churn and establishes useful associations between products. Companies can use this information to create personalised offers and campaigns for customers who are at risk of churning. The study also shows that association rules can help to identify customers' usage patterns, buying preferences and socio-economic influences.
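
The association analysis summarised above rests on two measures: the support of an item set (the fraction of customers whose records contain every item in it) and the confidence of a rule A -> B (the support of A and B together divided by the support of A). The sketch below illustrates these measures in Python, assuming a small hypothetical set of customer service subscriptions. It enumerates item pairs by brute force rather than implementing Apriori or FP-Growth, it is not the RapidMiner workflow used in this thesis, and the data, the function names and the thresholds are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the services one customer subscribes to.
# These records are illustrative only and are not drawn from the thesis dataset.
transactions = [
    {"phone", "internet", "streaming_tv"},
    {"phone", "internet"},
    {"phone", "internet", "online_security"},
    {"phone"},
    {"internet", "streaming_tv"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Enumerate one-to-one rules A -> B that meet both thresholds.

    Brute-force enumeration over item pairs; the thresholds are
    assumed values chosen only for this illustration.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # infrequent pair: no rule is generated from it
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = pair_support / support({antecedent}, transactions)
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


for antecedent, consequent, s, c in pair_rules(transactions):
    print(f"{antecedent} -> {consequent}: support={s:.2f}, confidence={c:.2f}")
```

Apriori and FP-Growth produce the same kind of frequent item sets and rules, but avoid exhaustive enumeration, which matters once the number of items grows beyond a toy example like this one.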