High Dimensional Robust Sparse Regression

International Conference on Artificial Intelligence and Statistics, Vol. 108 (2019)

Abstract
We provide a novel algorithm, to the best of our knowledge the first, for high-dimensional sparse regression with a constant fraction of corruptions in the explanatory and/or response variables. The algorithm recovers the true sparse parameters in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using it, we provide accurate estimators with sample complexity sub-linear in the dimension $d$: when the covariance matrix in sparse regression is the identity, our error guarantee is near information-theoretically optimal. We propose a filtering algorithm built on a novel randomized outlier removal technique for robust sparse mean estimation, which may be of interest in its own right: it is orderwise more computationally efficient than existing algorithms and succeeds with high probability, making it suitable for general use in iterative algorithms. We then address robust sparse regression with an unknown covariance matrix, where our algorithm achieves the best known error guarantee among polynomial-time statistical query algorithms for a wide class of structured covariance matrices, while requiring only sub-linear sample complexity. We demonstrate the algorithm's effectiveness on large-scale sparse regression problems with arbitrary corruptions.
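To make the idea of a robust variant of Iterative Hard Thresholding concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an identity covariance and replaces the paper's randomized filtering step with a plain coordinate-wise trimmed mean of the per-sample gradients, a simpler stand-in robust estimator that carries none of the stated guarantees. The names `hard_threshold`, `trimmed_mean`, and `robust_iht`, and the parameters `trim_frac`, `step`, and `n_iter`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
    return out


def trimmed_mean(G, trim_frac):
    """Coordinate-wise trimmed mean over the rows of G.

    Assumption: this is a simple stand-in for the paper's filtering
    algorithm, not a reimplementation of it.
    """
    if not 0.0 <= trim_frac < 0.5:
        raise ValueError("trim_frac must lie in [0, 0.5)")
    n = G.shape[0]
    cut = int(np.floor(trim_frac * n))
    G_sorted = np.sort(G, axis=0)  # sort each coordinate independently
    return G_sorted[cut:n - cut].mean(axis=0)


def robust_iht(X, y, k, trim_frac=0.1, step=1.0, n_iter=50):
    """Iterative Hard Thresholding with a robustly aggregated gradient.

    Each iteration forms the per-sample gradients of the squared loss
    0.5 * (x_i^T beta - y_i)^2, aggregates them with a trimmed mean,
    takes a gradient step, and projects onto k-sparse vectors.
    """
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        residual = X @ beta - y
        per_sample_grads = X * residual[:, None]  # shape (n, d)
        g = trimmed_mean(per_sample_grads, trim_frac)
        beta = hard_threshold(beta - step * g, k)
    return beta
```

A call such as `robust_iht(X, y, k=5, trim_frac=0.1)` returns a vector with at most `k` nonzero entries. Setting `trim_frac` somewhat above the suspected corruption fraction is a heuristic for this sketch, not a recommendation taken from the paper.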