A 28-nm Floating-Point Computing-in-Memory Processor Using Intensive-CIM Sparse-Digital Architecture

IEEE Journal of Solid-State Circuits (2024)

Abstract
Computing-in-memory (CIM) chips have demonstrated promising energy efficiency on multiply-accumulate (MAC) operations for artificial intelligence (AI) applications. While integer (INT) CIM chips are emerging, the floating-point (FP) CIM chip has not been well explored. The high-accuracy demands of larger models and complex tasks require FP computation, and most neural network (NN) training tasks still rely on it. This work presents an energy-efficient FP CIM processor. It is observed that most exponent values of FP data are concentrated in a small region. The FP computations are therefore divided into intensive and sparse parts and executed on an intensive-CIM sparse-digital architecture. First, an FP-to-INT CIM workflow is designed for the intensive FP operations to reduce the number of CIM execution cycles. Second, a flexible sparse-digital core is proposed for the remaining sparse FP operations. By combining the intensive-CIM and sparse-digital cores, this work achieves both high energy efficiency and accuracy identical to the FP algorithm baseline. Building on the FP CIM execution flow, a CIM-friendly low-bit FP training method is proposed to further reduce the execution cycles. In addition, a low-MAC-value (MACV) CIM macro is designed to exploit the more random sparsity introduced by FP alignment. The fabricated 28-nm chip achieves a macro energy efficiency of 275–1615 TOPS/W at INT4 and 17.2–91.3 TOPS/W at FP16, ranging from dense workloads to the average sparsity of the tested models.
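As a rough illustration of the intensive/sparse split described in the abstract, the NumPy sketch below partitions one FP dot product by exponent: operands whose exponents fall inside a narrow window around the most frequent exponent are aligned to a shared exponent and reduced to low-bit integers (emulating what the CIM macro would compute after FP-to-INT conversion), while the few exponent outliers stay in FP (emulating the sparse-digital core). The function name `split_fp_mac` and the parameters `window` and `mant_bits` are illustrative assumptions; the paper's actual alignment and datapath are implemented in hardware and may differ in detail.

```python
import numpy as np

def split_fp_mac(weights, activations, window=4, mant_bits=4):
    """Sketch: split an FP dot product into an INT (CIM) part and an FP (digital) part."""
    w = np.asarray(weights, dtype=np.float32)
    x = np.asarray(activations, dtype=np.float32)

    # Exponents of the weights; per the paper's observation, most of them
    # cluster in a small region around the most frequent value.
    exp = np.frexp(w)[1]
    mode = np.bincount(exp - exp.min()).argmax() + exp.min()
    intensive = np.abs(exp - mode) < window  # in-window ("intensive") operands

    # Intensive part: align mantissas to the largest in-window exponent and
    # quantize to a signed low-bit integer, emulating the INT MAC that the
    # CIM macro executes after FP-to-INT conversion (assumed scheme).
    emax = exp[intensive].max()
    scale = 2.0 ** (emax - (mant_bits - 1))
    qmax = 2 ** (mant_bits - 1) - 1
    w_int = np.clip(np.round(w[intensive] / scale), -qmax, qmax)
    int_part = float(np.dot(w_int, x[intensive])) * scale

    # Sparse part: the few exponent outliers stay in FP on the digital core.
    fp_part = float(np.dot(w[~intensive], x[~intensive]))

    return int_part + fp_part

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float16)  # exponents cluster near 0
x = rng.standard_normal(256).astype(np.float16)
print(split_fp_mac(w, x))                                         # split result
print(float(np.dot(w.astype(np.float32), x.astype(np.float32))))  # FP reference
```

Because the outliers are rare, nearly all MACs run as low-bit INT operations in this split, which is what lets the intensive-CIM path dominate the energy budget while the sparse-digital path preserves accuracy.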
Keywords
Computing-in-memory (CIM), floating point (FP), high energy efficiency, neural network (NN) processor, sparsity