List-decodeable Linear Regression

arXiv (2019)

Cited 68 · Viewed 94

Abstract
We give the first polynomial-time algorithm for robust regression in the list-decodable setting, where an adversary can corrupt more than a 1/2 fraction of the examples. For any alpha < 1, our algorithm takes as input a sample {(x_i, y_i)}_{i <= n} of n linear equations, where alpha*n of the equations satisfy y_i = <x_i, l*> + zeta for some small noise zeta and the remaining (1 - alpha)*n equations are arbitrarily chosen. It outputs a list L of size O(1/alpha), a fixed constant, that contains an l close to l*. Our algorithm succeeds whenever the inliers are drawn from a certifiably anti-concentrated distribution D. As a corollary of our algorithmic result, we obtain a (d/alpha)^{O(1/alpha^8)}-time algorithm that finds an O(1/alpha)-size list when the inlier distribution is the standard Gaussian. For discrete product distributions, which are anti-concentrated only in regular directions, we give an algorithm that achieves a similar guarantee under the promise that all coordinates of l* have the same magnitude. To complement our results, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. To solve the problem, we introduce a new framework for list-decodable learning that strengthens the "identifiability to algorithms" paradigm based on the sum-of-squares method.
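The abstract's setting can be made concrete with a small sketch. The sum-of-squares algorithm from the paper is far more involved; the toy below only illustrates why the problem is nontrivial: with an inlier fraction alpha < 1/2, naive least squares over all equations is pulled away from l*, while the inlier equations alone still determine l* (an oracle fit, shown only to demonstrate identifiability). All concrete values (alpha, n, d, the noise level, and the adversary's choice of a consistent alternative vector l_bad) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, alpha = 1000, 5, 0.3           # only an alpha = 0.3 fraction of inliers
l_star = rng.normal(size=d)          # the unknown regressor l*

# Inlier x's are standard Gaussian (an anti-concentrated distribution).
X = rng.normal(size=(n, d))
y = X @ l_star + 0.01 * rng.normal(size=n)   # y_i = <x_i, l*> + small noise

# Adversary overwrites the (1 - alpha) fraction of outlier equations so that
# they are exactly consistent with a different vector l_bad.
m = int(alpha * n)
l_bad = rng.normal(size=d)
y[m:] = X[m:] @ l_bad

# Naive least squares over all n equations is dragged toward l_bad.
l_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# An oracle fit on the inliers alone recovers l* accurately.
l_in, *_ = np.linalg.lstsq(X[:m], y[:m], rcond=None)

print(np.linalg.norm(l_ols - l_star))   # large: OLS fails below alpha = 1/2
print(np.linalg.norm(l_in - l_star))    # small: inliers identify l*
```

Since no single vector can be guaranteed close to l* when alpha < 1/2 (the adversary can plant several equally plausible solutions), the algorithm outputs a short list guaranteed to contain a good candidate, which is exactly the list-decodable guarantee stated above.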
Keywords
robust regression