Pushing the boundaries of parallel Deep Learning – A practical approach
arXiv: Distributed, Parallel, and Cluster Computing(2018)
Abstract
This work aims to assess the state of the art of data-parallel deep neural network training and to identify potential research tracks that could be exploited for performance improvement. Besides, it presents the design of a practical C++ library dedicated to implementing and unifying the current state-of-the-art methodologies for parallel training in a performance-conscious framework, allowing users to explore novel strategies without departing significantly from their usual workflow.