Classification Accuracy as a Proxy for Two Sample Testing.

arXiv: Learning (2016)

Abstract
When data analysts train a classifier and check whether its accuracy differs significantly from random guessing, they are implicitly performing a hypothesis test (a two-sample test), so it is important to ask whether this indirect method of testing is statistically optimal. Hypothesis tests attempt to maximize statistical power subject to a bound on the allowable false positive rate, whereas prediction attempts to minimize statistical risk on unseen data; we therefore study whether a predictive approach is prudent when the ultimate aim is testing. We formalize this problem in the two-sample mean-testing setting, where one must determine whether the means of two Gaussians (with known and equal covariance) are the same, but the analyst does so indirectly by checking whether the accuracy achieved by Fisher's LDA classifier differs significantly from chance. Unexpectedly, we find that the asymptotic power of LDA's sample-splitting classification accuracy is minimax rate-optimal in terms of problem-dependent parameters. Since prediction is commonly thought to be harder than testing, it may come as a surprise that solving a harder problem does not create an information-theoretic bottleneck for the easier one. On the flip side, even though the power is rate-optimal, our derivation suggests that it may be worse by a small constant factor; hence practitioners must be wary of using (admittedly flexible) prediction methods on disguised testing problems.
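The indirect procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `lda_accuracy_test`, the half-and-half split, and the one-sided normal approximation for the held-out accuracy are assumptions made for illustration, and the two samples are assumed to have equal size so that chance accuracy is 1/2.

```python
import math
import numpy as np

def lda_accuracy_test(x, y, cov):
    """Sample-splitting two-sample test via LDA held-out accuracy.

    Hypothetical helper, for illustration only. Assumes x and y are
    (n, d) arrays of equal length and cov is the known (d, d)
    covariance shared by both Gaussians.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample sizes")
    half = len(x) // 2
    x_tr, x_te = x[:half], x[half:]
    y_tr, y_te = y[:half], y[half:]

    # Fisher's LDA with known covariance: w = cov^{-1} (mean_x - mean_y),
    # with the decision boundary through the midpoint of the two means.
    mu_x, mu_y = x_tr.mean(axis=0), y_tr.mean(axis=0)
    w = np.linalg.solve(cov, mu_x - mu_y)
    midpoint = (mu_x + mu_y) / 2.0

    # Held-out accuracy: a point is labelled "x" when its score is positive.
    correct_x = np.sum((x_te - midpoint) @ w > 0)
    correct_y = np.sum((y_te - midpoint) @ w <= 0)
    n_te = len(x_te) + len(y_te)
    acc = (correct_x + correct_y) / n_te

    # Assumed calibration: compare acc with chance (1/2) using a one-sided
    # normal approximation with variance bound 1/(4 * n_te).
    z = (acc - 0.5) / math.sqrt(0.25 / n_te)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return float(acc), p_value
```

The analyst would reject the hypothesis of equal means when `p_value` falls below the chosen level. A permutation test, which appears among the paper's keywords, is an alternative way to calibrate the same accuracy statistic without relying on the normal approximation.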
Keywords
Classification accuracy, two sample testing, high-dimensional asymptotics, Hotelling's T^2 test, linear discriminant analysis, permutation test