Exploring nested ensemble learners using overproduction and choose approach for churn prediction in telecom industry

Neural Computing and Applications (2018)

Abstract
Combining multiple classifiers into hybrid learners (ensembles) has gained popularity in recent years. Ensembles are attracting growing interest in data mining because they have been reported to predict more accurately than individual classifiers, which has prompted experimentation with new ways of constructing them. This paper studies novel hybrid ways of combining multiple ensemble models using the 'overproduction and choose' approach. In contrast to the original concept of an ensemble, which combines individual learners, the proposed models are composed of other ensembles: learners are built as compositions of other learners, producing nested learners. Two such models, the Boosted-Stacked learner and the Bagged-Stacked learner, are proposed and shown to outperform traditional ensembles. Experiments are performed in the churn prediction domain using a benchmark customer churn dataset (available in the UCI repository) and a newly created dataset from a South Asian wireless telecom operator (SATO). The SATO dataset is balanced, with equal numbers of churners and non-churners. The Boosted-Stacked and Bagged-Stacked learners achieved accuracies of 98.4% and 97.2%, respectively, on the UCI Churn dataset, outperforming existing state-of-the-art techniques. Furthermore, high accuracy on the SATO dataset validates the effectiveness of the proposed models on balanced as well as imbalanced datasets.
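The idea of a nested learner selected by overproduction and choice can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the decision-stump base learners, the lookup-table meta-learner, the holdout split used for stacking, and all function names below are assumptions made for illustration. Only a bagged-stacked variant is shown; a boosted-stacked variant would replace the bootstrap resampling with reweighting of misclassified examples.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Synthetic binary 'churn' data (hypothetical): label depends on two features."""
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random(), rng.random()]
        y = 1 if x[0] + x[1] > 1.0 else 0
        if rng.random() < 0.05:  # 5% label noise
            y = 1 - y
        data.append((x, y))
    return data

def fit_stump(data, feature):
    """Base learner: the single threshold on one feature with the fewest training errors."""
    best = None
    for x, _ in data:
        t = x[feature]
        for sign in (0, 1):
            # sign=1 predicts class 1 above the threshold; sign=0 predicts class 1 below it
            errors = sum(((xi[feature] > t) == bool(sign)) != bool(yi) for xi, yi in data)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: int((x[feature] > t) == bool(sign))

def fit_stacked(data, n_features=3):
    """Stacked learner: stumps at level 0, a lookup-table meta-learner at level 1."""
    half = len(data) // 2
    base_part, meta_part = data[:half], data[half:]  # holdout split (an assumption)
    stumps = [fit_stump(base_part, f) for f in range(n_features)]
    votes = {}
    for x, y in meta_part:
        key = tuple(s(x) for s in stumps)
        votes.setdefault(key, Counter())[y] += 1
    # Meta-learner: majority label observed for each pattern of base predictions
    table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    default = Counter(y for _, y in meta_part).most_common(1)[0][0]
    return lambda x: table.get(tuple(s(x) for s in stumps), default)

def fit_bagged(fit_inner, data, n_bags, rng):
    """Nested ensemble: bagging whose members are themselves ensembles."""
    members = [fit_inner(rng.choices(data, k=len(data))) for _ in range(n_bags)]
    return lambda x: Counter(m(x) for m in members).most_common(1)[0][0]

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def overproduce_and_choose(train, valid, rng):
    """Overproduce candidate models, then choose the best one on validation data."""
    candidates = {
        "stump": fit_stump(train, 0),
        "stacked": fit_stacked(train),
        "bagged-stacked(5)": fit_bagged(fit_stacked, train, 5, rng),
        "bagged-stacked(15)": fit_bagged(fit_stacked, train, 15, rng),
    }
    scores = {name: accuracy(m, valid) for name, m in candidates.items()}
    best = max(scores, key=scores.get)
    return best, candidates[best], scores

rng = random.Random(0)
data = make_data(600, rng)
train, valid, test = data[:300], data[300:450], data[450:]
best_name, best_model, scores = overproduce_and_choose(train, valid, rng)
test_accuracy = accuracy(best_model, test)
print(best_name, scores, test_accuracy)
```

The candidate pool here is deliberately small. The selection step is kept separate from the final test split so that the accuracy reported for the chosen model is not biased by the choice itself.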
Keywords
Data mining, Churn prediction, Classification, Ensembles, Telecommunication industry