A 1-16b Reconfigurable 80Kb 7T SRAM-Based Digital Near-Memory Computing Macro for Processing Neural Networks

IEEE Transactions on Circuits and Systems I: Regular Papers (2023)

Abstract
This work introduces a digital SRAM-based near-memory compute macro for DNN inference that improves on-chip weight memory capacity and area efficiency over state-of-the-art digital computing-in-memory (CIM) macros. The proposed $20\times 256$ digital near-memory (NM) computing macro supports reconfigurable 1-16b precision through a bit-serial computing scheme, with a weight- and input-gating architecture for sparsity-aware operation. Each reconfigurable column MAC comprises $16\times$ custom-designed 7T SRAM bitcells to store a 1-16b weight, a conventional 6T SRAM cell for zero-weight-skip control, a bitwise multiplier, and a full adder with a register for partial-sum accumulation. The $20\times$ parallel partial-sum outputs are post-accumulated to generate a sub-partitioned output feature map, which is concatenated to produce the final convolution result. In addition, a pipelined array structure improves the throughput of the proposed macro. The macro implements 80Kb of binary weight storage in a 0.473mm2 die area in a 65nm process, achieving area and energy efficiencies of 4329-270.6 GOPS/mm2 and 315.07-1.23 TOPS/W at 1-16b precision.
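The bit-serial MAC with zero-weight skipping described above can be sketched in software as follows (a minimal illustration of the general technique; the function name, operand widths, and unsigned arithmetic are assumptions, not details from the paper):

```python
def bit_serial_mac(inputs, weights, in_bits=4):
    """Bit-serial multiply-accumulate with zero-weight gating.

    Each input is streamed one bit per cycle (LSB first). Per cycle, a
    bitwise multiply (AND of the input bit against the stored weight)
    feeds the partial-sum accumulator, shifted to the bit's significance.
    Columns holding a zero weight are skipped entirely, mirroring
    sparsity-aware weight gating.
    """
    acc = 0
    for x, w in zip(inputs, weights):
        if w == 0:              # zero-weight skip: no work for this column
            continue
        for b in range(in_bits):
            bit = (x >> b) & 1
            acc += (w * bit) << b   # bitwise multiply + shifted partial-sum add
    return acc

# The result matches the direct dot product:
xs, ws = [3, 5, 0, 7], [2, 0, 9, 1]
assert bit_serial_mac(xs, ws) == sum(x * w for x, w in zip(xs, ws))  # 13
```

Precision reconfiguration in such a scheme amounts to changing how many bit-cycles are streamed per input, trading latency for operand width.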
Keywords
SRAM, vector matrix multiplication, multiply-and-accumulate, PIM, CIM, digital near-memory computing