FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission
arXiv (2024)
Abstract
Communication overhead is a significant bottleneck in federated learning
(FL), and it has been exacerbated by the increasing size of AI models. In this
paper, we propose FedRDMA, a communication-efficient cross-silo FL system that
integrates RDMA into the FL communication protocol. To overcome the limitations
of RDMA in wide-area networks (WANs), FedRDMA divides the updated model into
chunks and designs a series of optimization techniques to improve the
efficiency and robustness of RDMA-based communication. We implement FedRDMA
atop an industrial federated learning framework and evaluate it in a
real-world cross-silo FL scenario. The experimental results show that FedRDMA
can achieve up to 3.8× speedup in communication efficiency compared to
traditional TCP/IP-based FL systems.
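The chunking step the abstract describes can be illustrated with a minimal sketch: a serialized model update is split into fixed-size chunks so each chunk can be posted as a separate RDMA transfer, then reassembled on the receiving silo. The function names and the chunk size below are illustrative assumptions, not details from the paper.

```python
def chunk_update(update: bytes, chunk_size: int = 1 << 20) -> list[bytes]:
    """Split a serialized model update into fixed-size chunks.

    Hypothetical helper: each chunk would be handed to one RDMA
    work request; the last chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [update[i:i + chunk_size] for i in range(0, len(update), chunk_size)]


def reassemble(chunks: list[bytes]) -> bytes:
    """Receiver side: concatenate chunks back into the full update."""
    return b"".join(chunks)


if __name__ == "__main__":
    update = b"\x01" * 2500          # dummy 2500-byte "model update"
    chunks = chunk_update(update, chunk_size=1024)
    print(len(chunks))               # 3 chunks: 1024 + 1024 + 452 bytes
    print(reassemble(chunks) == update)
```

The chunk size would in practice be tuned to the RDMA message-size limits and loss characteristics of the WAN link, which is where the paper's optimization techniques come in.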