A Rademacher Complexity Based Method for Controlling Power and Confidence Level in Adaptive Statistical Analysis

arxiv(2019)

引用 5|浏览26
暂无评分
摘要
While standard statistical inference techniques and machine learning generalization bounds assume that tests are run on data selected independently of the hypotheses, practical data analysis and machine learning are usually iterative and adaptive processes where the same holdout data is often used for testing a sequence of hypotheses (or models), which may each depend on the outcome of the previous tests on the same data. In this work, we present RADABOUND a rigorous, efficient and practical procedure for controlling the generalization error when using a holdout sample for multiple adaptive testing. Our solution is based on a new application of the Rademacher Complexity generalization bounds, adapted to dependent tests. We demonstrate the statistical power and practicality of our method through extensive simulations and comparisons to alternative approaches. In particular, we show that our rigorous solution is a substantially more powerful and efficient than the differential privacy based approach proposed in Dwork et al. [1]-[3].
更多
查看译文
关键词
Adaptive Analysis,Rademacher Complexity,Statistical Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要