CIMAT: a transpose SRAM-based compute-in-memory architecture for deep neural network on-chip training

Proceedings of the International Symposium on Memory Systems (2019)

Abstract
Rapid development of deep neural networks (DNNs) is enabling many intelligent applications. However, on-chip training of DNNs is challenging due to its extensive computation and memory bandwidth requirements. To address the memory wall bottleneck, the compute-in-memory (CIM) approach exploits analog computation along the bit lines of the memory array and thus significantly speeds up vector-matrix multiplications. So far, most CIM-based prototype chips implement inference engines only, with the models trained offline. In this work, we design the data flow for the backpropagation (BP) process and the weight update to support on-chip training based on CIM. Using mature and advanced 7 nm CMOS technology, we design a CIM architecture with a 7T transpose SRAM array that supports bidirectional parallel read. We evaluate 8-bit training of ResNet-18 on ImageNet using this CIM architecture for training, named CIMAT, showing that it achieves 3.38× higher energy efficiency (~6.02 TOPS/W), a 4.34× higher frame rate (~4,020 fps), and only 50% of the chip area compared to a baseline architecture with a conventional 6T SRAM array that supports row-by-row read only.
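To make the role of the transpose array concrete, the sketch below is an illustration under our own assumptions, not code from the paper. It contrasts the two matrix-vector products that training requires: the forward pass computes W·x along the rows of the stored weight matrix, while backpropagation computes Wᵀ·δ along its columns. A conventional 6T array streams only one orientation row by row, so the backward pass would need a serial re-read; a bidirectional (transpose) read serves both directions in parallel.

import numpy as np

# Illustrative sketch: the two products an on-chip training engine
# must map onto the weight array. Shapes are arbitrary examples.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))    # weights stored in the SRAM array
x = rng.standard_normal(3)         # layer input (forward activation)
delta = rng.standard_normal(4)     # error signal from the next layer

y = W @ x            # forward pass: multiply-accumulate along rows
grad_in = W.T @ delta  # backpropagation: multiply-accumulate along columns

print(y)         # activations passed to the next layer
print(grad_in)   # error propagated to the previous layer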
Keywords
compute-in-memory, deep neural network, on-chip training, transpose SRAM