High-Dimensional Robust Mean Estimation in Nearly-Linear Time.

SODA '19: Symposium on Discrete Algorithms, San Diego, California, January 2019

Abstract
We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial-time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions. In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and subgaussian tails, and (ii) unknown bounded covariance. Given N samples in ℝ^d, an ε-fraction of which may be arbitrarily corrupted, our algorithms run in time Õ(Nd)/poly(ε) and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times Ω̃(Nd²), for ε = Ω(1). Our algorithms rely on a natural family of SDPs parameterized by our current guess v for the unknown mean μ*. We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for μ* --- independent of our current guess v --- or a near-optimal solution to the dual SDP yields a new guess v' whose distance from μ* is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.
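The win-win analysis above drives an iterative scheme: each round either certifies the current guess or produces a guess measurably closer to μ*. The toy sketch below illustrates only that outer refinement structure, not the paper's method: the hypothetical `refine_guess` step replaces the primal/dual SDP machinery with a crude distance filter, and the radius and round count are illustrative choices.

```python
import numpy as np

def refine_guess(samples, v, radius):
    """One refinement round (hypothetical stand-in for the SDP step):
    keep samples within `radius` of the current guess v and average them."""
    kept = samples[np.linalg.norm(samples - v, axis=1) <= radius]
    return kept.mean(axis=0) if len(kept) else v

def robust_mean(samples, rounds=10):
    d = samples.shape[1]
    # Coordinate-wise median as a cheap, outlier-resistant initial guess.
    v = np.median(samples, axis=0)
    # Radius chosen for unit-covariance inliers: typical norm is about sqrt(d).
    radius = np.sqrt(d) + 3.0
    for _ in range(rounds):
        v = refine_guess(samples, v, radius)
    return v

rng = np.random.default_rng(0)
d = 20
clean = rng.normal(0.0, 1.0, size=(900, d))      # inliers: mean 0, identity covariance
outliers = rng.normal(10.0, 1.0, size=(100, d))  # 10% far-away corruptions
data = np.vstack([clean, outliers])

naive = data.mean(axis=0)     # dragged toward the outliers
robust = robust_mean(data)    # stays near the true mean 0
print(np.linalg.norm(naive), np.linalg.norm(robust))
```

Note the contrast this demonstrates: the naive sample mean is shifted by roughly ε times the outlier displacement, while even this crude filter recovers an estimate close to the true mean. The paper's SDP-based step achieves this with provable, information-theoretically optimal error against fully adversarial corruptions, which simple distance filtering does not.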
Keywords
robust estimation, high-dimensional, nearly-linear