Federated Variance-Reduced Stochastic Gradient Descent With Robustness To Byzantine Attacks

IEEE Transactions on Signal Processing (2020)

Abstract
This paper deals with distributed finite-sum optimization for learning over multiple workers in the presence of malicious Byzantine attacks. Most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules. However, the sizeable SGD-induced stochastic gradient noise makes it difficult to distinguish malicious messages sent by the Byzantine attackers from noisy stochastic gradients sent by the "honest" workers. This motivates reducing the variance of stochastic gradients as a means of robustifying SGD. To this end, a novel Byzantine-attack-resilient distributed (Byrd-) SAGA approach is introduced for federated learning tasks involving multiple workers. Rather than the mean employed by distributed SAGA, the novel Byrd-SAGA relies on the geometric median to aggregate the corrected stochastic gradients sent by the workers. When fewer than half of the workers are Byzantine attackers, Byrd-SAGA attains provably linear convergence to a neighborhood of the optimal solution, with the asymptotic learning error determined by the number of Byzantine workers. Numerical tests corroborate the robustness to various Byzantine attacks, as well as the merits of Byrd-SAGA over Byzantine-attack-resilient distributed SGD.
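To make the aggregation idea concrete, the sketch below illustrates the two ingredients named in the abstract: a SAGA-style variance-reduced gradient computed per worker, and a server update that aggregates the workers' messages with the geometric median (approximated here by Weiszfeld iterations) instead of the mean. This is a minimal illustration under assumed interfaces (`grad_fn`, `grad_table`, `server_update` are hypothetical names), not the authors' implementation.

```python
import numpy as np

def geometric_median(points, num_iters=100, tol=1e-8):
    """Approximate the geometric median of the row vectors in `points`
    via Weiszfeld iterations (assumed choice of solver for illustration)."""
    median = points.mean(axis=0)  # initialize at the coordinate-wise mean
    for _ in range(num_iters):
        dists = np.maximum(np.linalg.norm(points - median, axis=1), tol)
        weights = 1.0 / dists
        new_median = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < tol:
            break
        median = new_median
    return median

def saga_corrected_gradient(grad_fn, x, sample_idx, grad_table):
    """One worker's SAGA-style variance-reduced stochastic gradient:
    fresh sample gradient minus its stored gradient plus the table average."""
    g_new = grad_fn(x, sample_idx)
    corrected = g_new - grad_table[sample_idx] + grad_table.mean(axis=0)
    grad_table[sample_idx] = g_new  # refresh the stored gradient for this sample
    return corrected

def server_update(x, worker_msgs, step_size):
    """Server step: aggregate worker messages with the geometric median
    (robust to fewer than half Byzantine workers) instead of the mean."""
    agg = geometric_median(np.stack(worker_msgs))
    return x - step_size * agg
```

With mean aggregation a single Byzantine worker can shift the update arbitrarily far; the geometric median bounds the influence of any minority of corrupted messages, which is the property the abstract's convergence guarantee relies on.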
Keywords
Robustness, Stochastic processes, Distributed databases, Optimization, Convergence, Noise measurement, Task analysis, Distributed finite-sum optimization, Byzantine attacks, gradient noise, variance reduction