Does Federated Dropout actually work?

IEEE Conference on Computer Vision and Pattern Recognition (2022)

Abstract
Model sizes are limited in Federated Learning due to network bandwidth and on-device memory constraints. The success of increasing model sizes in other machine learning domains motivates the development of methods for training large-scale models in Federated Learning. To this end, [3] draws inspiration from dropout and proposes Federated Dropout: an algorithm in which clients train randomly selected subsets of a larger server model. Despite the promising empirical results and the many other works that build on it [1], [8], [13], we argue in this paper that the metrics used to measure the performance of Federated Dropout and its variants are misleading. We propose and perform new experiments which suggest that Federated Dropout is actually detrimental to scaling efforts. We show how a simple ensembling technique outperforms Federated Dropout and other baselines. We perform ablations that suggest that the best-performing variations of Federated Dropout approximate ensembling. The simplicity of ensembling allows for easy, practical implementations. Furthermore, ensembling naturally leverages the parallelizable nature of Federated Learning: it is easy to train several models independently because there are many clients and server compute is not the bottleneck. Ensembling's strong performance against our baselines suggests that Federated Learning models may be more easily scaled than previously thought with more sophisticated ensembling strategies, e.g., via boosting.
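The two main objects contrasted in the abstract, Federated Dropout's random submodels and the proposed ensembling baseline, can be sketched as follows. This is a toy NumPy illustration under simplifying assumptions, not the paper's implementation; the helper names (extract_submodel, merge_submodel, ensemble_predict) and the linear "model" are hypothetical.

```python
# Toy sketch (not the paper's code) contrasting Federated Dropout-style
# submodel training with a simple prediction ensemble.
import numpy as np

rng = np.random.default_rng(0)

def extract_submodel(W, keep_frac=0.5):
    """Federated Dropout-style extraction: keep a random subset of the
    hidden units (rows of the weight matrix) for one client round."""
    hidden = W.shape[0]
    idx = rng.choice(hidden, size=int(keep_frac * hidden), replace=False)
    return idx, W[idx]  # the client trains only this slice

def merge_submodel(W, idx, W_client):
    """Server writes the client's updated slice back into the full model."""
    W = W.copy()
    W[idx] = W_client
    return W

def ensemble_predict(models, x):
    """Ensembling alternative: average the outputs of independently
    trained (smaller) models instead of training slices of one big model."""
    return np.mean([W @ x for W in models], axis=0)

# Usage: one "server" weight matrix, one Federated Dropout round, and an
# ensemble of three independently initialized models of the same size.
W_server = rng.normal(size=(8, 4))
idx, W_sub = extract_submodel(W_server, keep_frac=0.5)
W_sub = W_sub - 0.1 * rng.normal(size=W_sub.shape)  # stand-in for local training
W_server = merge_submodel(W_server, idx, W_sub)

ensemble = [rng.normal(size=(8, 4)) for _ in range(3)]
x = rng.normal(size=4)
print(ensemble_predict(ensemble, x).shape)  # (8,)
```

The contrast mirrors the abstract's argument: Federated Dropout repeatedly trains and merges slices of one large model, while ensembling trains several models fully in parallel and combines their predictions, which is straightforward when client availability, rather than server compute, is the limiting resource.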
Keywords
Federated Dropout, model sizes, Federated Learning models, machine learning domains, ensembling strategies