In-Network Computation for Large-Scale Federated Learning Over Wireless Edge Networks

ArXiv(2023)

引用 3|浏览6
暂无评分
摘要
Most conventional Federated Learning (FL) models are using a star network topology where all users aggregate their local models at a single server (e.g., a cloud server). That causes significant overhead in terms of both communications and computing at the server, delaying the training process, especially for large scale FL systems with straggling nodes. This article proposes a novel edge network architecture that enables decentralizing the model aggregation process at the server, thereby significantly reducing the training delay for the whole FL network. Specifically, we design a highly-effective in-network computation framework (INC) consisting of a user scheduling mechanism, an in-network aggregation process (INA) which is designed for both primal- and primal-dual methods in distributed machine learning problems, and a network routing algorithm with theoretical performance bounds. The in-network aggregation process, which is implemented at edge nodes and cloud node, can adapt two typical methods to allow edge networks to effectively solve the distributed machine learning problems. Under the proposed INA, we then formulate a joint routing and resource optimization problem, aiming to minimize the aggregation latency. The problem turns out to be NP-hard, and thus we propose a polynomial time routing algorithm which can achieve near optimal performance with a theoretical bound. Simulation results showed that the proposed algorithm can achieve more than 99 $\%$ of the optimal solution and reduce the FL training latency, up to 5.6 times w.r.t other baselines. The proposed INC framework can not only help reduce the FL training latency but also significantly decrease cloud's traffic and computing overhead. By embedding the computing/aggregation tasks at the edge nodes and leveraging the multi-layer edge-network architecture, the INC framework can liberate FL from the star topology to enable large-scale FL.
更多
查看译文
关键词
Mobile edge computing,federated learning,in-network computation,large-scale distributed learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要