Mercury: Fast and Optimal Device Placement for Large Deep Learning Models

Proceedings of the 52nd International Conference on Parallel Processing (ICPP 2023), 2023

Abstract
Rapidly growing neural network models are becoming increasingly difficult to run on a single device. Hence, model parallelism across multiple devices is critical to training large models efficiently. Recent proposals either require long processing times to generate a placement or deliver poor performance. Therefore, we propose Mercury, a fast framework for optimizing device placement for large models. Mercury employs a simple but efficient model parallelization strategy for its baseline measurement and generates placement policies through a series of scheduling algorithms. We deploy and evaluate Mercury on numerous large models. The results show that Mercury not only reduces placement-policy generation time by 26.4% but also improves model throughput by 218.5% compared with state-of-the-art methods.
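To make the notion of "device placement" concrete, the sketch below shows a minimal, hand-written placement of a toy model's layers across two GPUs in PyTorch. This is only an illustration of the problem setting, not Mercury's algorithm; the class name, layer sizes, and device choices are hypothetical, and Mercury's contribution is to choose such partitions automatically via its scheduling algorithms.

```python
# Minimal sketch (assumed example, not Mercury's method): manually splitting a
# model's layers across two GPUs, i.e., a hand-crafted device placement.
import torch
import torch.nn as nn


class TwoDeviceMLP(nn.Module):
    def __init__(self, dim: int = 4096):
        super().__init__()
        # First half of the layers placed on GPU 0, second half on GPU 1.
        self.part0 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU()).to("cuda:0")
        self.part1 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU()).to("cuda:1")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Activations are copied between devices at the partition boundary;
        # a placement policy picks these boundaries to balance memory use
        # and reduce communication and device idle time.
        x = self.part0(x.to("cuda:0"))
        return self.part1(x.to("cuda:1"))


if __name__ == "__main__":
    model = TwoDeviceMLP()
    out = model(torch.randn(8, 4096))
    print(out.device)  # expected: cuda:1
```

A placement framework such as Mercury searches over many such partitionings (and their assignment to devices) to maximize throughput, rather than relying on a manually chosen split like the one above.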
Keywords
Model Parallelism, Optimization, Scheduling