A Versatile ReRAM-based Accelerator for Convolutional Neural Networks

2018 IEEE International Workshop on Signal Processing Systems (SiPS), 2018

Abstract
Though recent resistive random access memory (ReRAM)-based accelerator designs for convolutional neural networks (CNNs) achieve superior timing performance and area efficiency compared with CMOS-based accelerators, they have high energy consumption due to low inter-layer data reuse. In this work, we propose a multi-tile ReRAM accelerator that supports multiple CNN topologies, where each tile processes one or more layers in a pipelined fashion. Building upon the fact that a tile with a large receptive field can be built with a stack of smaller (3×3) filters, we design every tile with 9 processing elements that operate in a systolic fashion. The systolic data flow maximizes input feature map reuse and minimizes interconnection cost. We show that 1-bit weights and 4-bit activations achieve good accuracy for both AlexNet and VGGNet, and we design our ReRAM-based accelerator to support this configuration. System-level simulation results at the 32 nm node show that the proposed architecture for AlexNet with stacked small filters can achieve a computation efficiency of 8.42 TOPs/s/mm², an energy efficiency of 4.08 TOPs/s/W, and a storage efficiency of 0.18 MB/mm² for inference on one image from the CIFAR-100 dataset.
Keywords
CNN, ReRAM, accelerator, systolic
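
The abstract relies on two ideas that are easy to check numerically: a stack of 3×3 filters covers the same receptive field as one larger filter, and the data path uses 1-bit weights with 4-bit activations. The Python sketch below illustrates both. It is not the paper's implementation: all function names are hypothetical, and the sign-based weight binarization and uniform activation quantizer are assumed schemes, since the abstract does not say which ones the authors use.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 kernel x kernel convolutions."""
    return num_layers * (kernel - 1) + 1


def layers_needed(target_field, kernel=3):
    """Smallest stack depth whose receptive field covers target_field."""
    # Ceiling division of (target_field - 1) by (kernel - 1).
    return -(-(target_field - 1) // (kernel - 1))


def weights_per_channel_pair(num_layers, kernel=3):
    """Weight count for one input/output channel pair across the stack."""
    return num_layers * kernel * kernel


def binarize_weight(w):
    """Assumed 1-bit weight scheme: keep only the sign (+1 or -1)."""
    return 1 if w >= 0 else -1


def quantize_activation(x, bits=4, max_val=1.0):
    """Assumed uniform quantizer: clip to [0, max_val], map to 0 .. 2**bits - 1."""
    levels = (1 << bits) - 1
    clipped = min(max(x, 0.0), max_val)
    return round(clipped / max_val * levels)


if __name__ == "__main__":
    for field in (5, 7, 11):
        depth = layers_needed(field)
        print(
            f"{field}x{field} filter: {field * field} weights, "
            f"{depth} stacked 3x3 layers: {weights_per_channel_pair(depth)} weights"
        )
    print(binarize_weight(-0.7), quantize_activation(1.0))
```

Under these assumptions, an 11×11 filter such as the one in AlexNet's first layer needs five stacked 3×3 layers, which use 45 weights per channel pair instead of 121. That reduction is one reason a tile built from nine processing elements can serve several network topologies.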