BrainTTA: A 28.6 TOPS/W Compiler Programmable Transport-Triggered NN SoC

2023 IEEE 41st International Conference on Computer Design (ICCD), 2023

Abstract
Accelerators designed for deep neural network (DNN) inference with extremely low operand widths, down to 1-bit, have become popular due to their ability to significantly reduce energy consumption during inference. This paper introduces a compiler-programmable, flexible System-on-Chip (SoC) with mixed-precision support. The SoC is based on a Transport-Triggered Architecture (TTA) that facilitates efficient implementation of DNN workloads. By shifting the complexity of data movement from a hardware scheduler to the exposed-datapath compiler, DNN workloads can be implemented in an energy-efficient yet flexible way. The architecture is fully supported by a compiler and can be programmed using C/C++/OpenCL. The SoC is implemented in 22nm FDX technology and achieves a peak energy efficiency of 28.6/14.9/2.47 TOPS/W for binary, ternary, and 8-bit precision, respectively, while delivering a throughput of 614/307/77 GOPS. Compared to other programmable state-of-the-art (SotA) solutions, this work achieves up to 3.3x better energy efficiency.
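As a quick sanity check on the reported figures (this calculation is not from the paper itself), the implied power draw at each precision follows from dividing peak throughput by peak energy efficiency; since GOPS / (TOPS/W) works out to milliwatts, the ratio can be read off directly:

```python
# Implied power = throughput / efficiency, using the abstract's peak numbers.
# GOPS / (TOPS/W) = 1e9 / 1e12 W = 1e-3 W, so the ratio is directly in mW.
peaks = {
    "binary": (614, 28.6),   # (GOPS, TOPS/W) as reported in the abstract
    "ternary": (307, 14.9),
    "8-bit": (77, 2.47),
}
for mode, (gops, tops_per_w) in peaks.items():
    power_mw = gops / tops_per_w
    print(f"{mode}: ~{power_mw:.1f} mW implied power at peak")
```

Note that all three precisions imply a broadly similar power envelope (roughly 20-31 mW), consistent with a single datapath operated at different operand widths.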