A Vector Systolic Accelerator for Multi-Precision Floating-Point High-Performance Computing

IEEE Transactions on Circuits and Systems II: Express Briefs (2022)

Abstract
There is an emerging need to design multi-precision floating-point (FP) accelerators for high-performance computing (HPC) applications. However, existing multi-precision designs based on the high-precision-split method and the low-precision-combination method suffer from a low hardware utilization rate and a long multi-clock-cycle processing period, respectively. In this paper, a new pipelined multi-precision FP processing element (PE) is developed using the proposed redundancy-minimized bit-partitioning method. A 3.8× throughput improvement is achieved by the elaborately designed pipeline. In addition, a vector systolic structure is designed for the PE array to increase system-level throughput and energy efficiency. The proposed design is realized in a 28-nm process with a 1.351-GHz clock frequency. Compared with existing multi-precision FP methods, the proposed work exhibits the best energy efficiency: 1193 GFLOPS/W at FP16, 317 GFLOPS/W at FP32, and 77.3 GFLOPS/W at FP64, improvements of at least 22.3%, 30%, and 3.3%, respectively.
Keywords
Multi-precision, floating-point, PE, MAC, vector, systolic, HPC, accelerator