Accelerating wrapper-based feature selection with K-nearest-neighbor

Knowledge-Based Systems (2015)

Abstract
We propose to accelerate wrapper-based feature selection with a KNN classifier. We construct a classifier distance matrix to evaluate the quality of a feature. The proposed approach can be applied to three types of wrapper-based feature selectors. Theoretical time complexity analysis proves the efficiency of the proposed approach. Experimental results demonstrate its effectiveness and efficiency.

Wrapper-based feature subset selection (FSS) methods tend to obtain better classification accuracy than filter methods but are considerably more time-consuming, particularly for applications that have thousands of features, such as microarray data analysis. Accelerating this process without degrading its high accuracy would be of great value for gene expression analysis. In this study, we explored how to reduce the time complexity of wrapper-based FSS with an embedded K-Nearest-Neighbor (KNN) classifier. Instead of treating KNN as a black box, we proposed constructing a classifier distance matrix and updating it incrementally to accelerate the calculation of the relevance criteria used to evaluate the quality of candidate features. Extensive experiments on eight publicly available microarray datasets were first conducted to demonstrate the effectiveness of the wrapper methods with KNN for selecting informative features. To demonstrate the performance gain in terms of time cost reduction, we then conducted experiments on the eight microarray datasets with the embedded KNN classifiers and analyzed the theoretical time and space complexity. Both the experimental results and the theoretical analysis demonstrated that the proposed approach markedly accelerates the wrapper-based feature selection process without degrading the high classification accuracy, and the space complexity analysis indicated that the additional space overhead is affordable in practice.
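The idea of incrementally updating a distance matrix can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared Euclidean distance, sequential forward selection as the wrapper, leave-one-out KNN accuracy as the relevance criterion, and integer class labels 0..C-1. The function names (loo_knn_accuracy, feature_delta, forward_select) are hypothetical. Under these assumptions, the distance matrix for a candidate subset is the current matrix plus one per-feature term, so each candidate costs O(n^2) to evaluate instead of recomputing distances over every selected feature.

```python
import numpy as np

def loo_knn_accuracy(dist, y, k=3):
    """Leave-one-out KNN accuracy computed directly from a pairwise distance matrix."""
    d = dist.copy()
    np.fill_diagonal(d, np.inf)  # a sample must not count as its own neighbor
    nn = np.argpartition(d, k - 1, axis=1)[:, :k]  # k nearest neighbors per row
    votes = y[nn]  # neighbor labels, shape (n, k)
    n_classes = int(y.max()) + 1
    counts = np.zeros((len(y), n_classes), dtype=int)
    for c in range(n_classes):
        counts[:, c] = (votes == c).sum(axis=1)
    pred = counts.argmax(axis=1)  # majority vote (ties go to the lower label)
    return float((pred == y).mean())

def feature_delta(X, f):
    """Contribution of feature f alone to the squared Euclidean distance matrix."""
    col = X[:, f]
    return (col[:, None] - col[None, :]) ** 2

def forward_select(X, y, n_select, k=3):
    """Sequential forward selection that reuses one accumulated distance matrix."""
    n, p = X.shape
    dist = np.zeros((n, n))  # distance matrix over the currently selected features
    selected, remaining = [], list(range(p))
    for _ in range(n_select):
        best_f, best_acc = None, -1.0
        for f in remaining:
            # Incremental update: add only the candidate's term to the cached matrix.
            acc = loo_knn_accuracy(dist + feature_delta(X, f), y, k)
            if acc > best_acc:
                best_f, best_acc = f, acc
        dist += feature_delta(X, best_f)  # commit the winning feature
        selected.append(best_f)
        remaining.remove(best_f)
    return selected, dist
```

Calling forward_select(X, y, n_select=10) on an (n_samples, n_features) array returns the chosen feature indices together with the accumulated distance matrix. The extra storage is the single n-by-n matrix, which stays small when samples are few and features are many, as is typical for microarray data.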
Keywords
Gene selection, Microarray data, Wrapper, Filter, k-nearest-neighbor