Scheduling Large-Scale Distributed Training Via Reinforcement Learning

2018 IEEE International Conference on Big Data (Big Data)

Abstract
Scheduling the training procedure of deep neural networks (DNNs), such as tuning the learning rate, is crucial to the success of deep learning. Previous strategies such as piecewise and exponential learning rate schedulers have different arguments (hyper-parameters) that must be tuned manually. As data scale and model computation grow, searching for these arguments requires substantial empirical effort. To address this issue, this work proposes a policy scheduler that determines the arguments of the learning rate (lr) by reinforcement learning, significantly reducing the cost of tuning them. The policy scheduler has several appealing benefits. First, instead of requiring manually defined initial and final lr values, it determines these values autonomously during training. Second, rather than using predefined functions to update the lr, it adaptively oscillates the lr by monitoring learning curves, without human intervention. Third, it can select an lr for each block or layer of a DNN. Experiments show that DNNs trained with the policy scheduler achieve superior performance, outperforming previous work on various tasks and benchmarks such as ImageNet, COCO, and learning-to-learn.
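To make the idea concrete, here is a minimal sketch (not the authors' implementation) of an RL-style learning-rate controller in PyTorch: a small policy network observes statistics of the recent learning curve and samples a multiplicative adjustment to the optimizer's lr. The class name LRPolicy, the state features, and the discrete action set are illustrative assumptions, not details from the paper.

import torch
import torch.nn as nn

class LRPolicy(nn.Module):
    # Maps learning-curve statistics to a distribution over lr multipliers.
    def __init__(self, state_dim=3, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(),
                                 nn.Linear(32, n_actions))
        # Candidate actions: shrink, keep, or grow the current lr (assumed set).
        self.actions = torch.tensor([0.5, 1.0, 2.0])

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

def adjust_lr(optimizer, policy, recent_losses):
    # State = a simple summary of the recent learning curve (assumed features).
    losses = torch.tensor(recent_losses)
    state = torch.stack([losses.mean(),
                         losses.std(unbiased=False),
                         losses[-1] - losses[0]])
    dist = policy(state)
    action = dist.sample()
    # Apply the sampled multiplier to every parameter group; a per-block or
    # per-layer lr, as described in the abstract, would use one action per group.
    for group in optimizer.param_groups:
        group["lr"] *= policy.actions[action].item()
    # The log-probability would feed a policy-gradient update (e.g. REINFORCE)
    # whose reward reflects the resulting improvement of the training signal.
    return dist.log_prob(action)

In use, adjust_lr would be called periodically (e.g. every few hundred iterations) with the losses logged since the last adjustment; how the reward is defined and how often the policy acts are design choices the abstract does not specify.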
Keywords
Optimization, Reinforcement Learning, Convolutional Neural Network, Deep Learning