eDRAM-CIM: Compute-In-Memory Design with Reconfigurable Embedded-Dynamic-Memory Array Realizing Adaptive Data Converters and Charge-Domain Computing

2021 IEEE International Solid-State Circuits Conference (ISSCC)

Cited by 73
Abstract
The unprecedented growth in the size of deep neural networks (DNNs) has led to massive amounts of data movement between off-chip memory and on-chip processing cores in modern machine learning (ML) accelerators. Compute-in-memory (CIM) designs, which perform analog DNN computations within a memory array together with peripheral mixed-signal circuits, are being explored to mitigate this memory-wall bottleneck of memory latency and energy overhead. Embedded dynamic random-access memory (eDRAM) [1], [2], which monolithically integrates the 1T1C (T = transistor, C = capacitor) DRAM bitcell alongside high-performance logic transistors and interconnects, can enable custom CIM designs. It offers the densest embedded bitcell, a low pJ/bit access energy, a low soft-error rate, high endurance, high performance, and high bandwidth: all attributes desired in ML accelerators. In addition, the intrinsic charge-sharing operation of a dynamic memory access can be exploited to perform analog CIM computations by reconfiguring existing eDRAM columns as charge-domain circuits, greatly reducing peripheral circuit area and power overhead. Configuring part of the eDRAM as a CIM engine (for data conversion, DNN computations, and weight storage) while retaining the remainder as regular memory (for inputs, gradients during training, and non-CIM workload data) helps meet the layer- and kernel-dependent storage needs that vary across a DNN inference or training step. Thus, the high cost per bit of eDRAM can be amortized by repurposing part of the existing large-capacity level-4 eDRAM caches [7] in high-end microprocessors into large-scale CIM engines.
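For context, the charge-sharing relation that this style of charge-domain computing builds on can be sketched as follows (this derivation is standard DRAM physics, not taken from the paper itself): when a 1T1C cell capacitor $C_c$ holding voltage $V_{cell}$ is connected to a bitline of capacitance $C_{BL}$ precharged to $V_{pre}$, conservation of charge gives the settled bitline voltage

$$V_{BL} = \frac{C_{BL}\,V_{pre} + C_c\,V_{cell}}{C_{BL} + C_c}.$$

Activating $N$ cells on a shared line generalizes this to a capacitively weighted average,

$$V_{BL} = \frac{C_{BL}\,V_{pre} + \sum_{i=1}^{N} C_i\,V_i}{C_{BL} + \sum_{i=1}^{N} C_i},$$

so with matched cell capacitors the bitline settles to an analog sum/average of the stored values: the primitive that charge-domain CIM designs can reuse for multiply-accumulate operations and for capacitive-DAC-style data conversion.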
Keywords
eDRAM-CIM, reconfigurable embedded-dynamic-memory array, adaptive data converters, off-chip memory, on-chip processing cores, modern machine learning accelerators, analog DNN computations, mixed-signal circuits, memory-wall bottleneck, energy overhead, random-access memory, high-performance logic transistors, custom CIM designs, densest embedded bitcell, low soft error rate, ML accelerators, intrinsic charge sharing operation, dynamic memory access, analog CIM computations, charge domain circuits, peripheral circuit area, CIM engine, data conversion, regular memory, non-CIM workload data, level-4 eDRAM caches, high-end microprocessors, large-scale CIM engines, compute-in-memory design, charge-domain computing, deep neural network size, eDRAM columns, compute-in-memory designs, DRAM bitcell, weight storage, layer/kernel-dependent variable storage, DNN inference/training step