All Relevant Feature Selection Methods And Applications

FEATURE SELECTION FOR DATA AND PATTERN RECOGNITION (2015)

Abstract
All-relevant feature selection is a relatively new sub-field of feature selection. This chapter gives a short review of the field and presents a representative algorithm. The problem of all-relevant feature selection is first defined, and then the key algorithms are described. Finally, the Boruta algorithm, under development at ICM, University of Warsaw, is explained in greater detail and applied to a collection of both synthetic and real-world data sets. The algorithm is shown to be both sensitive and selective. The level of falsely discovered relevant variables is low: on average, less than one falsely relevant variable is discovered per data set. The sensitivity of the algorithm is nearly 100% for data sets for which classification is easy, but may be lower for data sets for which classification is difficult. Nevertheless, the sensitivity can be increased at the cost of additional computational effort, without adversely affecting the false discovery level, by increasing the number of trees in the random forest that delivers the importance estimate in Boruta.
Keywords
All-relevant feature selection, Strong and weak relevance, Feature importance, Boruta, Random forest