Efficient-Adam: Communication-Efficient Distributed Adam.

IEEE Trans. Signal Process. (2023)

Abstract
Distributed adaptive stochastic gradient methods have been widely used for large-scale nonconvex optimization, such as training deep learning models. However, their communication complexity for finding $\varepsilon$-stationary points has rarely been analyzed in the nonconvex setting. In this work, we present a novel communication-efficient distributed Adam in the parameter-server model for stochastic nonconvex optimization, dubbed Efficient-Adam. Specifically, we incorporate a two-way quantization scheme into Efficient-Adam to reduce the communication cost between the workers and the server. Simultaneously, we adopt a two-way error feedback strategy to reduce the biases introduced by the two-way quantization on both the server and the workers. In addition, we establish the iteration complexity of the proposed Efficient-Adam for a class of quantization operators, and further characterize its communication complexity between the server and workers when an $\varepsilon$-stationary point is reached. Finally, we apply Efficient-Adam to solve a toy stochastic convex optimization problem and to train deep learning models on real-world vision and language tasks. Extensive experiments, together with the theoretical guarantees, justify the merits of Efficient-Adam.
Keywords
efficient-adam, communication-efficient
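To make the two-way quantization and error-feedback idea from the abstract concrete, below is a minimal, single-process sketch in Python that simulates workers and a server on a toy stochastic quadratic problem. The quantizer, the per-worker and server error buffers, the hyperparameters, and all names here are illustrative assumptions; this is not the paper's exact algorithm, notation, or compressor class, and bias correction is omitted from the Adam-style update for brevity.

```python
# Sketch: parameter-server Adam-style training with two-way (worker->server and
# server->worker) quantization and error feedback, on a toy stochastic quadratic.
# All specifics (quantizer, hyperparameters, buffers) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, num_workers, steps = 10, 4, 2000
x_true = rng.normal(size=d)          # optimum of the toy objective 0.5 * ||x - x_true||^2
x = np.zeros(d)                      # model parameters held by the server

# Adam state kept on the server (bias correction omitted in this sketch).
m, v = np.zeros(d), np.zeros(d)
beta1, beta2, lr, eps = 0.9, 0.999, 0.05, 1e-8

def quantize(g, num_levels=4):
    """Stochastic uniform quantizer -- one assumed example of a lossy compressor."""
    scale = np.max(np.abs(g)) + 1e-12
    q = np.round(np.abs(g) / scale * num_levels + rng.uniform(0, 1, g.shape) - 0.5)
    return np.sign(g) * np.clip(q, 0, num_levels) * scale / num_levels

# Error-feedback memories: one per worker (worker -> server direction)
# and one on the server (server -> worker direction).
worker_err = [np.zeros(d) for _ in range(num_workers)]
server_err = np.zeros(d)

for t in range(steps):
    # Workers: compute noisy gradients, quantize with error feedback.
    compressed = []
    for i in range(num_workers):
        g = (x - x_true) + 0.1 * rng.normal(size=d)   # stochastic gradient
        corrected = g + worker_err[i]                  # re-inject last round's residual
        q = quantize(corrected)
        worker_err[i] = corrected - q                  # remember what quantization lost
        compressed.append(q)

    # Server: aggregate, take an Adam-style step, quantize the model update back.
    g_bar = np.mean(compressed, axis=0)
    m = beta1 * m + (1 - beta1) * g_bar
    v = beta2 * v + (1 - beta2) * g_bar ** 2
    delta = lr * m / (np.sqrt(v) + eps)

    corrected = delta + server_err                     # server-side error feedback
    q_delta = quantize(corrected)
    server_err = corrected - q_delta
    x = x - q_delta                                    # compressed update applied to the model

print("final distance to optimum:", np.linalg.norm(x - x_true))
```

The point of the two error buffers is that whatever the lossy quantizer drops in one round is carried over and re-sent in later rounds, in both communication directions, which is how the bias of the compression is kept under control in this style of method.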