A study of ensemble machine learning to improve telecommunication customer churn prediction
Authors
Raut, Nikhil Vilas
Issue Date
2020
Degree
MSc in Data Analytics
Publisher
Dublin Business School
Rights holder
Rights
Items in eSource are protected by copyright. Previously published items are made available in accordance with the copyright policy of the publisher/copyright holder.
Abstract
The telecom industry has an ever-increasing number of service providers. As providers compete for customers with attractive offers, customers have an incentive to keep switching. Providers can stay ahead by proactively making attractive offers to retain the customers who are at risk of leaving. Machine learning is applied to predict the customers most likely to churn. This research uses state-of-the-art H2O stacked ensemble methods to improve on the predictive performance of the base models. The tools used for the research are RapidMiner, an analytics automation tool, and Python code using the H2O framework, which is designed specifically for artificial intelligence. Traditional models are implemented in RapidMiner, namely the Generalized Linear Model, gradient boosted trees, deep learning, and logistic regression. The models implemented in Python using the H2O framework are a stacked ensemble, deep learning, and a distributed random forest. Base model results are compared against the ensemble methods and evaluated on various performance metrics.
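The stacking idea named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the thesis code and does not use the H2O API. It assumes two hypothetical features (monthly charge and number of support calls), synthetic data, one-feature threshold rules as stand-in base models, and a logistic regression as a stand-in meta-learner. Each base model predicts on rows it was not trained on, and the meta-learner is fitted on those out-of-fold predictions.

```python
import math
import random


def fit_stump(rows, labels, feature):
    """Base learner: a one-feature threshold rule chosen by training accuracy."""
    best_acc, best_t, best_sign = -1.0, 0.0, 1
    for t in sorted({r[feature] for r in rows}):
        for sign in (1, -1):
            hits = sum((sign * (r[feature] - t) >= 0) == bool(y)
                       for r, y in zip(rows, labels))
            if hits / len(rows) > best_acc:
                best_acc, best_t, best_sign = hits / len(rows), t, sign
    return lambda r: 1.0 if best_sign * (r[feature] - best_t) >= 0 else 0.0


def fit_logistic(xs, ys, epochs=200, lr=0.5):
    """Meta-learner: logistic regression fitted by batch gradient descent."""
    w, b = [0.0] * len(xs[0]), 0.0

    def prob(x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = prob(x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        for j in range(len(w)):          # update in place so prob() sees it
            w[j] -= lr * grad_w[j] / len(xs)
        b -= lr * grad_b / len(xs)
    final_w, final_b = list(w), b
    return lambda x: 1.0 / (1.0 + math.exp(
        -(sum(wi * xi for wi, xi in zip(final_w, x)) + final_b)))


def stack(rows, labels, features, k=5):
    """Collect out-of-fold base predictions, then fit the meta-learner on them."""
    n = len(rows)
    level_one = [None] * n
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        tr_rows = [rows[i] for i in train_idx]
        tr_labels = [labels[i] for i in train_idx]
        models = [fit_stump(tr_rows, tr_labels, f) for f in features]
        for i in range(fold, n, k):      # rows held out of this fold
            level_one[i] = [m(rows[i]) for m in models]
    meta = fit_logistic(level_one, labels)
    base = [fit_stump(rows, labels, f) for f in features]  # refit on all rows
    return lambda r: meta([m(r) for m in base])


# Synthetic, hypothetical churn data: (monthly_charge, support_calls).
random.seed(0)
rows, labels = [], []
for _ in range(200):
    charge = random.uniform(20, 120)
    calls = random.randint(0, 8)
    noisy_score = charge / 120 + calls / 8 + random.gauss(0, 0.2)
    rows.append((charge, calls))
    labels.append(1 if noisy_score > 1.0 else 0)

predict = stack(rows[:150], labels[:150], features=(0, 1))
test_rows, test_labels = rows[150:], labels[150:]
accuracy = sum((predict(r) >= 0.5) == bool(y)
               for r, y in zip(test_rows, test_labels)) / len(test_labels)
print(f"hold-out accuracy: {accuracy:.2f}")
```

Fitting the meta-learner on out-of-fold predictions, rather than on predictions for rows the base models have already seen, keeps it from simply rewarding whichever base model overfits the training data most.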