A 7-nm Four-Core Mixed-Precision AI Chip With 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling

IEEE Journal of Solid-State Circuits (2022)

Abstract
Reduced-precision computation is a key enabler of energy-efficient acceleration for deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precisions—FP16, Hybrid-FP8 (HFP8), INT4, and INT2—covering diverse application demands for both training and inference. The chip leverages cutting-edge algorithm...
Keywords
Training, Artificial intelligence, AI accelerators, Inference algorithms, Computer architecture, Bandwidth, System-on-chip
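The abstract highlights INT4 as one of the chip's inference precisions. As background, the sketch below shows the generic idea of symmetric 4-bit quantization with a per-tensor scale; it is purely illustrative and does not reflect this chip's actual datapath, and all function names and the scaling scheme are assumptions.

```python
import numpy as np

def quantize_int4(x: np.ndarray):
    """Map float values onto the signed 4-bit grid [-8, 7] using a
    per-tensor scale; returns (integer codes, scale)."""
    scale = float(np.max(np.abs(x))) / 7.0  # largest magnitude lands at code +/-7
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the 4-bit codes."""
    return q.astype(np.float32) * scale

# Example: quantize a small tensor and inspect the reconstruction error.
x = np.array([0.9, -0.35, 0.02, -1.0], dtype=np.float32)
q, s = quantize_int4(x)
x_hat = dequantize_int4(q, s)
```

Compute on the low-bit codes, not the floats, is what yields the energy savings the abstract attributes to reduced precision; the scale is carried alongside and folded back in at dequantization.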