A study of ensemble machine learning to improve telecommunication customer churn prediction

Authors

Raut, Nikhil Vilas

Issue Date

2020

Degree

MSc in Data Analytics

Publisher

Dublin Business School

Rights

Items in eSource are protected by copyright. Previously published items are made available in accordance with the copyright policy of the publisher/copyright holder.

Abstract

The telecom industry has an ever-increasing number of service providers. Because providers compete for customers with attractive offers, customers have an incentive to keep switching providers. Service providers can stay ahead by proactively making attractive offers to retain customers who are at risk of leaving. Machine learning is applied to predict which customers are most likely to churn. This research uses state-of-the-art H2O stacked ensemble methods to improve on the predictive performance of the base models. Two tools are used: RapidMiner, an analytics automation tool, and Python code using the H2O framework, which is designed specifically for artificial intelligence. The traditional models implemented in RapidMiner are the Generalized Linear Model, Gradient Boosted Trees, Deep Learning, and Logistic Regression. The models implemented in Python with the H2O framework are Stacked Ensemble, Deep Learning, and Distributed Random Forest. The base model results are compared against those of the ensemble methods and evaluated on various performance metrics.