Computer Science
An effective strategy for churn prediction and customer profiling
Published in: Data and Knowledge Engineering
Customer churn prediction and profiling are two major economic concerns for many companies. Different learning approaches have been proposed; however, the a priori choice of the most suitable model to perform both tasks remains non-trivial, as it depends heavily on the intrinsic characteristics of the churn data. Our study compares eight supervised machine learning methods combined with seven sampling approaches on thirteen public churn data sets. Our evaluations, reported in terms of area under the curve (AUC), explore the influence of rebalancing strategies and data properties on the performance of the learning methods. We rely on the Nemenyi test and Correspondence Analysis to visualize the association between models, rebalancing strategies, and data. This work identifies the most appropriate methods in an attrition context and proposes an effective pipeline based on an ensemble approach and segmentation with deep autoencoders. Our strategy can inform marketing or human resources departments about the behavioral patterns of customers and their attrition probability. The described experiments are fully reproducible, and our proposal can be applied to a wide range of churn-like data sets.
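To make the evaluation setting concrete, the following is a minimal sketch of two ingredients mentioned above: scoring a model by AUC and rebalancing an imbalanced churn sample. It assumes that AUC refers to the area under the ROC curve and uses random oversampling as a stand-in rebalancing strategy; the abstract does not say which seven sampling approaches were used. The function names `auc` and `random_oversample` and the toy data are hypothetical and are not taken from the paper.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (churner, non-churner) pairs that the scores rank correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # A pair counts 1 if the churner scores higher, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both
    classes have the same size (binary labels 0/1 assumed)."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    minority = min(by_class, key=lambda c: len(by_class[c]))
    majority = 1 - minority
    if not by_class[minority]:
        raise ValueError("cannot oversample an empty class")
    deficit = len(by_class[majority]) - len(by_class[minority])
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


# Toy data (hypothetical): 1 = churner, 0 = retained customer.
toy_labels = [0, 0, 0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
toy_rows = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]

toy_auc = auc(toy_labels, toy_scores)
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
```

In a comparison like the one described, a classifier would be trained on the rebalanced training split only, and the AUC would be computed on an untouched test split so that duplicated rows do not leak into the evaluation.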