On Distributed Averaging for Stochastic k-PCA

Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

Cited by 22 | Viewed 19
Abstract
In the stochastic k-PCA problem, we are given i.i.d. samples from an unknown distribution over vectors, and the goal is to compute the top k eigenvalues and eigenvectors of the moment matrix. In the simplest distributed variant, we have m machines, each of which receives n samples. Each machine performs some computation and sends an O(k)-size summary of its local dataset to a central server. The server aggregates these summaries and computes the desired eigenvalues and eigenvectors. The goal is to achieve the same accuracy as the server would by computing with all mn samples itself. The main design choices in this framework are the choice of summary and the method of aggregation. We consider a slight variant of the well-studied distributed-averaging approach, and prove that it leads to significantly better bounds on the dependence between n and the eigenvalue gaps. Our method can also be applied directly to a setting where the 'right' value of the parameter k (i.e., one for which there is a non-trivial eigenvalue gap) is not known exactly. This is a common issue in practice which prior methods were unable to address.
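The distributed-averaging framework described above can be sketched as follows. This is a minimal illustration of the generic scheme (local top-k eigendecomposition of the empirical moment matrix, server-side averaging of the rank-k approximations), not the paper's specific variant; all function names and parameters are illustrative assumptions.

```python
import numpy as np

def local_summary(X, k):
    """One machine's O(k)-size summary: the top-k eigenpairs of the
    local empirical moment matrix M_hat = X^T X / n (X is n-by-d)."""
    M = X.T @ X / X.shape[0]
    vals, vecs = np.linalg.eigh(M)  # eigh returns ascending eigenvalues
    return vals[-k:], vecs[:, -k:]

def aggregate(summaries, k):
    """Server-side distributed averaging: average the rank-k
    approximations V_i diag(lam_i) V_i^T sent by the m machines,
    then return the top-k eigenpairs of the average."""
    d = summaries[0][1].shape[0]
    avg = np.zeros((d, d))
    for vals, vecs in summaries:
        avg += vecs @ np.diag(vals) @ vecs.T
    avg /= len(summaries)
    vals, vecs = np.linalg.eigh(avg)
    return vals[-k:], vecs[:, -k:]
```

For example, with data drawn from a spiked covariance model (strong variance along one direction), the aggregated top eigenvector aligns closely with the planted direction even though each machine sees only its own n samples.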
Keywords
moment matrix,server computing