High Density 8-Bit Multiplier Systolic Arrays for FPGA

2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)

Abstract
Artificial Intelligence (AI) has become the fastest growing application area for FPGAs. Two types of numerics are needed. Training typically uses floating-point arithmetic (which is now widely available as embedded functions in current FPGAs). Inference is typically calculated with lower-precision integer numbers, which can be implemented with embedded functions, soft logic, or a combination of the two. INT8 performance is therefore used as a typical benchmarking metric for current FPGAs. Recent publications based on Xilinx devices show the extraction of two INT8 multipliers from a 24×18 multiplier. A paper from Intel describes how to obtain two INT8 multipliers from an 18×18 multiplier, with the help of a small amount of soft logic. In this paper we introduce a number of new INT8 multiplier techniques, starting with the Intel 18×18 multiplier approach. Using both memory and logic resources, for a more balanced use of the FPGA features, we improve the INT8 density, and also show a signed-magnitude (SM) 1.7 construct that is even smaller. To demonstrate the usability of these new multipliers, we develop a scalable systolic array that contains up to 32,768 SM1.7 multipliers, or 28,800 INT8 multipliers, and fits in an Intel Stratix 10 2800 device. Finally, we implement a system architecture that includes input and output flow buffering and control, which can be instantiated directly into a larger AI design, or can enable the FPGA to be used as a standalone accelerator. This system exceeds 400 MHz for the largest array on a mid-speed device (26 TOPS INT8), and can operate up to 600 MHz for smaller array sizes.
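
The packing idea referenced in the abstract (extracting two INT8 products from one wider hard multiplier) can be illustrated with a small, self-checking sketch. The sketch below follows the general shared-operand scheme: two signed 8-bit weights are packed into one wide operand at an assumed 18-bit offset and multiplied by a shared activation, and the two products are then recovered from the single wide result. In this naive form the packed operand needs 27 bits; the published 24×18 and 18×18 extractions (and the paper's SM1.7 construct) use narrower multipliers plus correction logic that is not shown here. All names, the field offset, and the test values are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only (not the paper's construction): compute two signed
# 8-bit products that share one operand using a single wide multiply, then
# split the result. The 18-bit field offset and helper names are assumptions.

SHIFT = 18  # assumed offset between the two packed weight fields

def pack_two_weights(w1, w2):
    """Pack two signed 8-bit weights into one wide operand: w1 * 2^SHIFT + w2."""
    assert -128 <= w1 <= 127 and -128 <= w2 <= 127
    return (w1 << SHIFT) + w2

def split_products(full):
    """Recover (w1 * a, w2 * a) from the single wide product full = packed * a."""
    low = full & ((1 << SHIFT) - 1)        # low field = (w2 * a) mod 2^SHIFT
    if low >= 1 << (SHIFT - 1):            # reinterpret the low field as signed
        low -= 1 << SHIFT
    high = (full - low) >> SHIFT           # what remains is exactly w1 * a
    return high, low

# Exhaustive self-check over all signed 8-bit weight pairs, with a few
# representative shared operands.
for w1 in range(-128, 128):
    for w2 in range(-128, 128):
        for a in (-128, -37, 0, 1, 127):
            p1, p2 = split_products(pack_two_weights(w1, w2) * a)
            assert (p1, p2) == (w1 * a, w2 * a)
print("two products per wide multiply: packing/splitting verified")
```

In hardware the split amounts to slicing the wide product and applying a one-bit correction when the low field is negative; narrower-multiplier extractions trade the wide operand for a small amount of such soft-logic correction. The quoted TOPS figures follow the usual convention of counting two operations (a multiply and an add) per multiplier per clock cycle.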
Keywords
embedded functions,FPGAs,soft logic,INT8 performance,INT8 multiplier techniques,Intel 18×18 multiplier approach,8-bit multiplier systolic arrays,AI design,systolic array,artificial intelligence,floating point arithmetic,Xilinx devices,logic resources