On the necessity of irrelevant variables

The Journal of Machine Learning Research (2012)

Abstract

This work explores the effects of relevant and irrelevant Boolean variables on the accuracy of classifiers. The analysis uses the assumption that the variables are conditionally independent given the class, and focuses on a natural family of learning algorithms for such sources when the relevant variables have a small advantage over random guessing. The main result is that algorithms relying predominantly on irrelevant variables have error probabilities that quickly go to 0 in situations where algorithms that limit the use of irrelevant variables have errors bounded below by a positive constant. We also show that accurate learning is possible even when there are so few examples that one cannot determine with high confidence whether or not any individual variable is relevant.
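The setting described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not the paper's algorithm or analysis. It assumes a uniformly random class label, relevant variables that agree with the label with probability 1/2 + gamma, and irrelevant variables that are fair coins. The function names (`sample`, `fit_votes`, `predict`, `error_rate`, `demo`), the threshold-based selection rule, and all parameter values are hypothetical choices made for this example.

```python
import random


def sample(n_vars, n_relevant, gamma, rng):
    """Draw one (x, y) pair under the assumed source model.

    y is a fair coin. The first n_relevant variables agree with y with
    probability 1/2 + gamma; the rest are fair coins independent of y.
    """
    y = rng.randrange(2)
    x = []
    for i in range(n_vars):
        if i < n_relevant:
            agrees = rng.random() < 0.5 + gamma
            x.append(y if agrees else 1 - y)
        else:
            x.append(rng.randrange(2))
    return x, y


def fit_votes(train, threshold):
    """Assign each variable a vote in {-1, 0, +1}.

    A variable votes +1 if its empirical agreement with the class exceeds
    1/2 by more than `threshold`, -1 if it falls below 1/2 by more than
    `threshold`, and 0 (excluded) otherwise. A threshold of 0 keeps almost
    every variable; a larger threshold keeps fewer.
    """
    m = len(train)
    n = len(train[0][0])
    votes = []
    for i in range(n):
        agreement = sum(1 for x, y in train if x[i] == y) / m
        if agreement - 0.5 > threshold:
            votes.append(1)
        elif 0.5 - agreement > threshold:
            votes.append(-1)
        else:
            votes.append(0)
    return votes


def predict(votes, x):
    """Unweighted vote: map each bit to +/-1, flip it if its vote is -1."""
    score = sum(v * (2 * xi - 1) for v, xi in zip(votes, x))
    return 1 if score > 0 else 0


def error_rate(votes, test):
    """Fraction of test examples that the voting classifier gets wrong."""
    return sum(predict(votes, x) != y for x, y in test) / len(test)


def demo(seed=0, n_vars=200, n_relevant=50, gamma=0.1,
         n_train=50, n_test=500):
    """Return {threshold: test error} for a few selection thresholds."""
    rng = random.Random(seed)
    train = [sample(n_vars, n_relevant, gamma, rng) for _ in range(n_train)]
    test = [sample(n_vars, n_relevant, gamma, rng) for _ in range(n_test)]
    results = {}
    for threshold in (0.0, 0.1, 0.2):
        votes = fit_votes(train, threshold)
        results[threshold] = error_rate(votes, test)
    return results


if __name__ == "__main__":
    for threshold, err in demo().items():
        print(f"threshold={threshold:.1f}  test error={err:.3f}")
```

Running `demo` with different seeds and thresholds gives a rough feel for how an aggressive threshold, which discards most irrelevant variables along with many relevant ones, compares with a permissive one. The figures it prints are properties of this toy model and should not be read as the paper's bounds.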

Keywords

main result, small advantage, high confidence, accurate learning, error probability, irrelevant variable, irrelevant boolean variable, natural family, individual variable, relevant variable
