The Fast Cauchy Transform and Faster Robust Linear Regression

SIAM J. Comput. (2016)

Cited by 112 | Views 188
Abstract
We provide fast algorithms for overconstrained ℓp regression and related problems: for an n × d input matrix A and vector b ∈ ℝ^n, in O(nd log n) time we reduce the problem min_{x∈ℝ^d} ||Ax − b||_p to the same problem with input matrix Ã of dimension s × d and corresponding b̃ of dimension s × 1. Here, Ã and b̃ are a coreset for the problem, consisting of sampled and rescaled rows of A and b; and s is independent of n and polynomial in d. Our results improve on the best previous algorithms when n ≫ d, for all p ∈ [1, ∞) except p = 2; in particular, they improve the O(nd^{1.376+}) running time of Sohler and Woodruff (STOC, 2011) for p = 1, which uses asymptotically fast matrix multiplication, and the O(nd^5 log n) time of Dasgupta et al. (SICOMP, 2009) for general p, which uses ellipsoidal rounding. We also provide a suite of improved results for finding well-conditioned bases via ellipsoidal rounding, illustrating tradeoffs between running time and conditioning quality, including a one-pass conditioning algorithm for general ℓp problems. To complement this theory, we provide a detailed empirical evaluation of implementations of our algorithms for p = 1, comparing them with several related algorithms. Among other things, our empirical results clearly show that, in the asymptotic regime, the theory is a very good guide to the practical performance of these algorithms. Our algorithms use our faster constructions of well-conditioned bases for ℓp spaces and, for p = 1, a fast subspace embedding of independent interest that we call the Fast Cauchy Transform: a matrix Π: ℝ^n ↦ ℝ^{O(d log d)}, found obliviously to A, that approximately preserves the ℓ1 norms: that is, ||Ax||_1 ≈ ||ΠAx||_1 for all x, with distortion O(d^{2+η} log d), for an arbitrarily small constant η > 0; and, moreover, ΠA can be computed in O(nd log d) time.
The techniques underlying our Fast Cauchy Transform include fast Johnson-Lindenstrauss transforms, low-coherence matrices, and rescaling by Cauchy random variables.
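To make the core idea concrete, the following is a minimal sketch (not the paper's algorithm) of the Cauchy-rescaling principle behind the embedding: a dense matrix Π of i.i.d. standard Cauchy entries makes each coordinate of ΠAx a Cauchy variable with scale proportional to ||Ax||_1, so ||ΠAx||_1 roughly tracks ||Ax||_1 for all x in the column space. The dense Π here costs O(snd) to apply; the Fast Cauchy Transform of the paper composes this with fast Johnson-Lindenstrauss transforms and low-coherence matrices to reach O(nd log d) time. Dimensions n, d, s and the scaling 1/s are ad hoc choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, s = 5000, 4, 60  # sketch size s plays the role of O(d log d); constants are ad hoc

A = rng.standard_normal((n, d))

# Dense Cauchy sketch: i.i.d. standard Cauchy entries, rescaled by 1/s.
# For any fixed x, each entry of (Pi @ A) @ x is Cauchy with scale ||Ax||_1 / s,
# so the l1 norm of the sketched vector concentrates (up to poly(d) distortion)
# around ||Ax||_1.
Pi = rng.standard_cauchy((s, n)) / s

PA = Pi @ A  # sketched matrix: s x d instead of n x d

# Empirically compare l1 norms before and after sketching on random directions x.
ratios = []
for _ in range(200):
    x = rng.standard_normal(d)
    ratios.append(np.linalg.norm(PA @ x, 1) / np.linalg.norm(A @ x, 1))

print(min(ratios), float(np.median(ratios)))
```

Because the absolute value of a Cauchy variable is heavy-tailed, the upper side of the distortion is controlled only up to a poly(d) factor (as in the O(d^{2+η} log d) bound of the paper), while the lower side concentrates well; the ratios printed above should stay bounded away from zero but can occasionally be large.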
Keywords
algorithms, design, robust regression, theory, computation of transforms