Statistical Versus Distance-Based Meta-Features For Clustering Algorithm Recommendation Using Meta-Learning

Bruno Almeida Pimentel, Andre C. P. L. E. De Carvalho

2018 International Joint Conference on Neural Networks (IJCNN), 2018

Abstract
When a Machine Learning algorithm is applied to a dataset, its predictive performance depends on how well its bias suits the data distribution in the dataset, which has led researchers to create a large number of algorithms. The most suitable algorithm for a new dataset can be found by trial and error, trying a large number of algorithms with distinct biases. However, this approach usually has a high computational cost. This cost could be reduced if the most suitable algorithm(s) could be recommended directly. Meta-learning has been successfully used to recommend the best Machine Learning algorithm in several Machine Learning tasks. Meta-learning can rank algorithms according to their adequacy for a new dataset and use this ranking to recommend the algorithms to be used. As the recommended ranking is based on dataset features, dataset characterization (using meta-features) is of crucial importance for the successful use of meta-learning. Clustering is one of the main applications of Machine Learning algorithms; however, few works investigate the use of meta-learning for the recommendation of clustering algorithms. Moreover, the existing works use a poor methodology for evaluating the algorithm recommendation method and a small number of datasets. This paper compares two types of meta-features, statistical and distance-based, for clustering algorithm recommendation using meta-learning. Experimental results show in which situations each type of meta-feature is more suitable.
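To make the idea of ranking algorithms from dataset meta-features concrete, the following is a minimal sketch and not the authors' method. The abstract does not list the meta-features or the meta-learner used, so everything here is an assumption chosen for illustration: the function names, the particular summary statistics, and the nearest-neighbour rank averaging are all hypothetical.

```python
import math
import statistics


def statistical_meta_features(X):
    """Illustrative statistical meta-features (assumed, not from the paper):
    the mean and population standard deviation of each attribute,
    averaged over all attributes."""
    columns = list(zip(*X))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return [statistics.fmean(means), statistics.fmean(stdevs)]


def distance_meta_features(X):
    """Illustrative distance-based meta-features (assumed, not from the paper):
    mean and standard deviation of the pairwise Euclidean distances,
    scaled by the largest distance. Needs at least two rows in X."""
    dists = [math.dist(a, b) for i, a in enumerate(X) for b in X[i + 1:]]
    top = max(dists) or 1.0  # avoid dividing by zero when all rows coincide
    scaled = [d / top for d in dists]
    return [statistics.fmean(scaled), statistics.pstdev(scaled)]


def recommend_ranking(new_mf, meta_base, k=2):
    """Hypothetical k-NN meta-learner: average the known algorithm ranks of
    the k datasets closest to new_mf in meta-feature space, then sort the
    algorithms by that average (lower rank means more suitable).

    meta_base is a list of (meta_feature_vector, {algorithm_name: rank})."""
    nearest = sorted(meta_base, key=lambda entry: math.dist(new_mf, entry[0]))[:k]
    algorithms = list(nearest[0][1])
    avg_rank = {a: statistics.fmean(e[1][a] for e in nearest) for a in algorithms}
    return sorted(algorithms, key=lambda a: avg_rank[a])
```

In use, either meta-feature function would characterize a new dataset, and `recommend_ranking` would return the algorithm names ordered from most to least recommended. Swapping one function for the other is the kind of comparison the abstract describes, although the paper's actual feature sets and evaluation protocol may differ.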
Keywords
meta-learning,recommended ranking,dataset features,algorithm recommendation method,clustering algorithm recommendation,statistical versus distance-based meta-features,machine learning algorithm,data distribution