Understanding and Optimizing Packed Neural Network Training for Hyper-Parameter Tuning
International Conference on Management of Data (2020)
Abstract
As neural networks are increasingly employed in machine learning practice,
how to efficiently share limited training resources among a diverse set of
model training tasks becomes a crucial issue. To achieve better utilization of
the shared resources, we explore the idea of jointly training multiple neural
network models on a single GPU in this paper. We realize this idea by proposing
a primitive, called pack. We further present a comprehensive empirical study of
pack and end-to-end experiments that suggest significant improvements for
hyperparameter tuning. The results suggest: (1) packing two models can bring up
to a 40% performance improvement over unpacked setups for a single training step,
and the improvement increases when packing more models; (2) the benefit of the
pack primitive largely depends on a number of factors including memory
capacity, chip architecture, neural network structure, and batch size; (3)
there exists a trade-off between packing and unpacking when training multiple
neural network models on limited resources; (4) a pack-aware Hyperband is up to
2.7x faster than the original Hyperband, with the speedup growing as memory
size increases and, with it, the number of models that can be packed.
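The pack primitive fuses the training of several models into a single GPU computation, so one device serves multiple tuning trials at once. The sketch below is a minimal illustration of that idea in PyTorch, not the authors' implementation: two hypothetical model configurations (model_a, model_b, e.g. different hidden widths explored by a tuner) share each input batch, their losses are summed, and one backward pass updates both parameter sets in a single fused step.

```python
# Minimal sketch of the packing idea (illustrative only; the names and
# model shapes here are assumptions, not the paper's code).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Two hypothetical configurations being tuned, packed onto one GPU.
model_a = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
model_b = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)

# One optimizer over the union of both parameter sets, so a single
# backward pass updates both packed models.
optimizer = torch.optim.SGD(
    list(model_a.parameters()) + list(model_b.parameters()), lr=0.01
)
loss_fn = nn.CrossEntropyLoss()

def packed_step(x, y):
    """One fused training step: both models consume the same batch and
    their losses are summed, so the GPU runs one combined graph."""
    optimizer.zero_grad()
    loss = loss_fn(model_a(x), y) + loss_fn(model_b(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with a random batch.
x = torch.randn(64, 784, device=device)
y = torch.randint(0, 10, (64,), device=device)
packed_step(x, y)
```

Whether such a fused step beats two separate steps depends, as the abstract notes, on memory capacity, chip architecture, network structure, and batch size; packing pays off when the individual models leave the GPU underutilized.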
Keywords
neural network, hyper-parameter