Randomized Distributed Mean Estimation: Accuracy vs Communication.

arXiv: Distributed, Parallel, and Cluster Computing (2016)

Abstract
We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications, including reduce-all operations within algorithms for distributed and federated optimization and learning. We propose a flexible family of randomized algorithms exploring the trade-off between expected communication cost and estimation error. Our family contains the full-communication and zero-error method on one extreme, and an $\epsilon$-bit communication and $\mathcal{O}\left(1/(\epsilon n)\right)$ error method on the opposite extreme. In the special case where we communicate, in expectation, a single bit per coordinate of each vector, we improve upon existing results by obtaining $\mathcal{O}(r/n)$ error, where $r$ is the number of bits used to represent a floating point value.
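The trade-off described in the abstract can be illustrated with a minimal sketch of one unbiased randomized encoder. This is an illustration under stated assumptions, not necessarily the exact protocol analyzed in the paper: every coordinate is transmitted with the same probability `p`, each node also sends its own coordinate average `mu` as a "center", and the server simply averages the decoded vectors. The names `encode`, `estimate_mean`, `p`, and `mu` are hypothetical.

```python
import random


def encode(x, p, rng):
    """Randomly sparsify one node's vector while keeping it unbiased.

    Each coordinate is transmitted with probability p (an assumed, uniform
    choice). Coordinates that are not sent are replaced by the node center
    mu; sent coordinates are rescaled so that E[y[j]] == x[j].
    """
    mu = sum(x) / len(x)  # node center, assumed to be communicated as well
    y = []
    for v in x:
        if rng.random() < p:
            # Sent coordinate: rescale to cancel the bias of the other branch.
            y.append(v / p - (1.0 - p) / p * mu)
        else:
            # Unsent coordinate: the receiver falls back to the center.
            y.append(mu)
    return y


def estimate_mean(vectors, p, seed=0):
    """Average the randomly encoded vectors of all nodes, coordinate-wise."""
    rng = random.Random(seed)
    encoded = [encode(x, p, rng) for x in vectors]
    n = len(encoded)
    d = len(encoded[0])
    return [sum(y[j] for y in encoded) / n for j in range(d)]


if __name__ == "__main__":
    nodes = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
    print(estimate_mean(nodes, p=1.0))  # full communication, exact average
    print(estimate_mean(nodes, p=0.5))  # roughly half the coordinates sent
```

With `p = 1` every coordinate is sent and the exact average is recovered. For smaller `p` the expected communication shrinks, while in this sketch the variance contributed by each coordinate grows as `(1 - p) / p` times its squared distance from the node center.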
Keywords
communication efficiency,distributed mean estimation,accuracy-communication tradeoff,gradient compression,quantization