DIMC: 2219TOPS/W 2569F²/b Digital In-Memory Computing Macro in 28nm Based on Approximate Arithmetic Hardware

2022 IEEE International Solid-State Circuits Conference (ISSCC), 2022

Abstract
In-memory-computing (IMC) SRAM architecture has gained significant attention as it achieves high energy efficiency for computing a convolutional neural network (CNN) model [1]. Recent works investigated the use of analog-mixed-signal (AMS) hardware for high area and energy efficiency [2], [3]. However, AMS hardware output is well known to be susceptible to process, voltage, and temperature (PVT) variations, limiting the computing precision and ultimately the inference accuracy of a CNN. We reconfirmed, through the simulation of a capacitor-based IMC SRAM macro that computes a 256D binary dot product, that the AMS computing hardware has a significant root-mean-square error (RMSE) of 22.5% across the worst-case voltage, temperature (Fig. 16.1.1 top left) and 3-sigma process variations (Fig. 16.1.1 top right). On the other hand, we can implement an IMC SRAM macro using robust digital logic [4], which can virtually eliminate the variability issue (Fig. 16.1.1 top). However, digital circuits require more devices than AMS counterparts (e.g., 28 transistors for a mirror full adder [FA]). As a result, a recent digital IMC SRAM shows a lower area efficiency of 6368F²/b (22nm, 4b/4b weight/activation) [5] than the AMS counterpart (1170F²/b, 65nm, 1b/1b) [3]. In light of this, we aim to adopt approximate arithmetic hardware to improve area and power efficiency and present two digital IMC macros (DIMC) with different levels of approximation (Fig. 16.1.1 bottom left). Also, we propose an approximation-aware training algorithm and a number format to minimize inference accuracy degradation induced by approximate hardware (Fig. 16.1.1 bottom right).
We prototyped a 28nm test chip: for a 1b/1b CNN model for CIFAR-10 and across 0.5-to-1.1V supply, the DIMC with double-approximate hardware (DIMC-D) achieves 2569F²/b, 932-2219TOPS/W, 475-20032GOPS, and 86.96% accuracy, while for a 4b/1b CNN model, the DIMC with the single-approximate hardware (DIMC-S) achieves 3814F²/b, 458-990TOPS/W (normalized to 1b/1b), 405-19215GOPS (normalized to 1b/1b), and 90.41% accuracy.
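To make the 256D binary dot product and the RMSE figure of merit concrete, the following is a minimal Python sketch. It is not the circuit from the paper: `approx_dot` uses a hypothetical approximation (an OR gate in place of the first adder level) chosen only for illustration, and normalizing the RMSE to the full output range of 2N is an assumption, since the abstract does not state how the 22.5% figure is normalized.

```python
import math
import random

N = 256  # vector length, matching the 256D binary dot product in the abstract


def exact_dot(w, x):
    """Exact dot product of two vectors with entries in {-1, +1}."""
    # XNOR of the sign bits is 1 where entries agree; count those matches.
    matches = sum(1 for wi, xi in zip(w, x) if wi == xi)
    return 2 * matches - len(w)


def approx_dot(w, x):
    """Same dot product with a hypothetical approximate first adder level.

    Each pair of XNOR bits is "added" with a single OR gate, so 1 + 1 yields 1
    instead of 2. This is an illustrative stand-in, not the paper's hardware.
    Assumes an even vector length.
    """
    bits = [1 if wi == xi else 0 for wi, xi in zip(w, x)]
    pair_sums = [bits[i] | bits[i + 1] for i in range(0, len(bits), 2)]
    matches = sum(pair_sums)  # the remaining levels of the tree are exact
    return 2 * matches - len(w)


def rmse_percent(pairs, full_scale):
    """RMSE between (exact, approx) pairs as a percentage of full_scale."""
    mse = sum((e - a) ** 2 for e, a in pairs) / len(pairs)
    return 100.0 * math.sqrt(mse) / full_scale


rng = random.Random(0)  # fixed seed so the experiment is repeatable
pairs = []
for _ in range(1000):
    w = [rng.choice((-1, 1)) for _ in range(N)]
    x = [rng.choice((-1, 1)) for _ in range(N)]
    pairs.append((exact_dot(w, x), approx_dot(w, x)))

# Assumed normalization: the output spans [-N, +N], so full scale is 2 * N.
print(f"RMSE of the toy approximation: {rmse_percent(pairs, 2 * N):.2f}% of full scale")
```

The toy approximation always undercounts matches, so its error is biased. A systematic error of this kind is what an approximation-aware training scheme would have to absorb, although the abstract does not describe how the authors' algorithm does so.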
Keywords
DIMC,Digital In-Memory Computing Macro,approximate arithmetic hardware,In-memory-computing SRAM architecture,energy efficiency,convolutional neural network model,CNN,analog-mixed-signal hardware,AMS hardware output,capacitor-based IMC SRAM macro,256D binary dot product,AMS computing hardware,root-mean-square error,robust digital logic,digital circuits,mirror full adder,area efficiency,power efficiency,digital IMC macros,approximation-aware training algorithm,inference accuracy degradation,double-approximate hardware,single-approximate hardware,digital IMC SRAM,DIMC-D,DIMC-S,process voltage and temperature variations,PVT variations,size 28.0 nm,size 22.0 nm,size 65.0 nm,voltage 1.1 V