Near-Data Processing in Memory Expander for DNN Acceleration on GPUs

IEEE Computer Architecture Letters (2021)

Abstract
We propose a near-data processing (NDP) architecture that exploits a memory expander with byte-addressable memory-semantic interconnect to accelerate memory-bound operations in deep neural networks (DNNs). Our architecture can execute NDP operations on the memory traffic from the GPU on-the-fly by employing bump-in-the-wire NDP logic between the off-chip link and memory controller. In addition, the memory-bound operations executed on the NDP unit can be effectively overlapped with compute-intensive operations executed on a GPU, even if the two operations have a dependency. Furthermore, the NDP offloading can be automatically done by the compiler without any code modification by deep learning practitioners. Our approach can achieve a 51% speedup for training VGG-16 with batch normalization.
Keywords
Deep neural network training,memory wall
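
The abstract distinguishes memory-bound operations (such as batch normalization) from compute-intensive ones. As a rough illustration of that distinction, the sketch below estimates arithmetic intensity (FLOPs per byte of memory traffic) for a batch-normalization forward pass and for a 3x3 convolution. The cost model, the function names, and the layer shape are assumptions made for illustration; none of them are taken from the paper.

```python
def batchnorm_intensity(n, c, h, w, bytes_per_elem=4):
    """Rough FLOPs-per-byte estimate for a training-mode batch-norm forward pass."""
    elems = n * c * h * w
    # Assumed cost model: about 9 FLOPs per element in total
    # (mean, variance, normalize, scale and shift).
    flops = 9 * elems
    # Assumed traffic: read x for the statistics, read x again to normalize, write y.
    traffic = 3 * elems * bytes_per_elem
    return flops / traffic


def conv3x3_intensity(n, c_in, c_out, h, w, bytes_per_elem=4):
    """Rough FLOPs-per-byte estimate for a 3x3, stride-1, same-padded convolution."""
    elems_in = n * c_in * h * w
    elems_out = n * c_out * h * w
    weights = c_out * c_in * 3 * 3
    # One multiply and one add per weight per output element.
    flops = 2 * elems_out * c_in * 3 * 3
    # Assumed ideal traffic: each tensor touched exactly once (perfect on-chip reuse).
    traffic = (elems_in + elems_out + weights) * bytes_per_elem
    return flops / traffic


if __name__ == "__main__":
    # Hypothetical shape: batch of 32, 64 channels, 224x224 feature map.
    bn = batchnorm_intensity(32, 64, 224, 224)
    conv = conv3x3_intensity(32, 64, 64, 224, 224)
    print(f"batch norm: {bn:.2f} FLOPs/byte")
    print(f"3x3 conv:   {conv:.2f} FLOPs/byte")
```

Under these assumptions the batch-norm pass performs under one FLOP per byte moved, while the convolution performs on the order of a hundred, so the former is limited by memory bandwidth rather than by arithmetic throughput. This is the kind of operation the abstract proposes to offload to the NDP unit.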