On Principled Local Optimization Methods for Federated Learning
CoRR (2024)
Abstract
Federated Learning (FL), a distributed learning paradigm that scales
on-device learning collaboratively, has emerged as a promising approach for
decentralized AI applications. Local optimization methods such as Federated
Averaging (FedAvg) are the most prominent methods for FL applications. Despite
their simplicity and popularity, the theoretical understanding of local
optimization methods is far from clear. This dissertation aims to advance the
theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in
Federated Learning. We demonstrate how FedAvg may suffer from a notion we call
iterate bias, and how an additional third-order smoothness assumption may
mitigate this effect and lead to better convergence rates. We explain this
phenomenon from a Stochastic Differential Equation (SDE) perspective.
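To make the FedAvg setting concrete, here is a minimal sketch of the algorithm's structure: each client runs several local gradient steps from the shared iterate, and the server averages the local results. This is an illustrative toy (deterministic gradients, full participation; the function and parameter names are ours, not from the dissertation), not the analyzed stochastic setting.

```python
import numpy as np

def fedavg(client_grads, w0, rounds=50, local_steps=10, lr=0.1):
    """Toy FedAvg sketch: local SGD steps per client, then server averaging.

    client_grads: list of gradient oracles, one per client.
    """
    w = np.asarray(w0, dtype=float)
    for _ in range(rounds):
        local_iterates = []
        for grad in client_grads:            # full client participation
            w_local = w.copy()
            for _ in range(local_steps):     # local gradient steps
                w_local -= lr * grad(w_local)
            local_iterates.append(w_local)
        w = np.mean(local_iterates, axis=0)  # server averaging step
    return w
```

With multiple local steps, each client drifts toward its own minimizer before averaging; for heterogeneous objectives this drift is the source of the iterate bias discussed above.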
Second, we propose Federated Accelerated Stochastic Gradient Descent (FedAc),
the first principled acceleration of FedAvg, which provably improves the
convergence rate and communication efficiency. Our technique is based on a
potential-based perturbed iterate analysis, a novel stability analysis of
generalized accelerated SGD, and a strategic tradeoff between acceleration and
stability.
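The general pattern behind accelerating local methods can be illustrated as follows: clients run accelerated local updates, and the server averages the *entire* optimizer state (iterate and momentum buffer) so acceleration stays synchronized. This sketch uses plain heavy-ball momentum for simplicity; it is not the exact FedAc recursion, and all names are illustrative.

```python
import numpy as np

def fedac_sketch(client_grads, w0, rounds=50, local_steps=10,
                 lr=0.05, momentum=0.9):
    """Illustrative accelerated local method (NOT the exact FedAc updates):
    accelerated local steps, with the server averaging both the iterate
    and the momentum buffer at each communication round."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)                     # shared momentum buffer
    for _ in range(rounds):
        local_w, local_v = [], []
        for grad in client_grads:
            w_i, v_i = w.copy(), v.copy()
            for _ in range(local_steps):     # accelerated local updates
                v_i = momentum * v_i + grad(w_i)
                w_i = w_i - lr * v_i
            local_w.append(w_i)
            local_v.append(v_i)
        # average the full optimizer state, not just the iterate
        w = np.mean(local_w, axis=0)
        v = np.mean(local_v, axis=0)
    return w
```

The tradeoff the abstract mentions shows up here: more aggressive acceleration makes local trajectories less stable under averaging, which is why FedAc balances the two.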
Third, we study the Federated Composite Optimization problem, which extends
the classic smooth setting by incorporating a shared non-smooth regularizer. We
show that direct extensions of FedAvg may suffer from the "curse of primal
averaging," resulting in slow convergence. As a solution, we propose a new
primal-dual algorithm, Federated Dual Averaging, which overcomes the curse of
primal averaging by employing a novel inter-client dual averaging procedure.
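The key idea — averaging clients in the dual space so the shared non-smooth regularizer is applied through a single proximal map, rather than averaging already-thresholded primal points — can be sketched as follows for an l1 regularizer. This is a simplified illustration under our own choice of step sizes and names, not the dissertation's exact algorithm.

```python
import numpy as np

def soft_threshold(u, tau):
    """Proximal map of tau * ||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def fed_dual_avg(client_grads, dim, rounds=50, local_steps=5,
                 lr=0.1, lam=0.1):
    """Sketch of inter-client dual averaging for
    min_w (1/n) sum_i f_i(w) + lam * ||w||_1.
    Clients accumulate gradients in a dual state z; the server averages z
    (not the primal iterates), so the l1 prox is applied to the averaged
    dual state instead of averaging sparse primal points."""
    z = np.zeros(dim)                 # shared dual state
    t = 0                             # total local-step counter
    for _ in range(rounds):
        local_z = []
        for grad in client_grads:
            z_i = z.copy()
            for k in range(local_steps):
                # primal point is the prox of the (scaled) dual state
                w_i = soft_threshold(-lr * z_i, lr * lam * (t + k + 1))
                z_i = z_i + grad(w_i)         # dual gradient accumulation
            local_z.append(z_i)
        z = np.mean(local_z, axis=0)  # inter-client *dual* averaging
        t += local_steps
    return soft_threshold(-lr * z, lr * lam * t)
```

Averaging in the primal instead would average several soft-thresholded (sparse) points, destroying sparsity — a simple view of the "curse of primal averaging."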