High Density Pipelined 8bit Multiplier Systolic Arrays for FPGA.

FPGA(2020)

Abstract
With the advent of AI and machine learning as the highest-profile FPGA applications, INT8 performance is currently one of the key benchmarking metrics. In current devices, INT8 multipliers must be extracted from higher-precision multipliers. Recently, we reported the implementation of a mixed DSP Block and soft logic design, with 22,400 INT8 multipliers and a system clock rate of 416 MHz, on the Intel Stratix 10 2800 chip. In this paper we demonstrate alternate techniques for integer multiplier construction that better balance the resource types on current FPGAs (logic, memory, and DSP) and so significantly improve the multiplier density, and therefore the dot product density. We further extend these techniques to an 8-bit signed-magnitude (SM) 1.7 representation, which can further improve arithmetic density by using the logic and memory resources more flexibly. We describe variable composition dot product structures, which can be assembled in a scalable 2D systolic array. In one example, we report a design containing 32,768 SM1.7 multipliers, with a clock rate of 432 MHz, giving a system performance of over 28 TOPS. Our INT8 densities are improved by up to 30% over the earlier work; we show one design with 28,800 INT8 multipliers. In all cases, enough device resources are left free and accessible to implement a full application-level design.
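The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration, not the authors' methodology: it assumes the common convention that each multiplier contributes two operations per clock cycle (one multiply and one accumulate), and the function name `peak_tops` is hypothetical. The clock rate of the 28,800-multiplier INT8 design is not given in the abstract, so only its multiplier count is compared.

```python
def peak_tops(multipliers: int, clock_mhz: float, ops_per_mult: int = 2) -> float:
    """Peak throughput in tera-operations per second.

    ops_per_mult=2 is an assumption (multiply + accumulate per cycle),
    not a figure stated in the abstract.
    """
    return multipliers * ops_per_mult * clock_mhz * 1e6 / 1e12


# Figures quoted in the abstract.
sm17_tops = peak_tops(32_768, 432)      # SM1.7 design
earlier_tops = peak_tops(22_400, 416)   # earlier mixed DSP/soft-logic INT8 design

# Density gain of the 28,800-multiplier INT8 design over the earlier 22,400.
int8_gain = 28_800 / 22_400 - 1

print(f"SM1.7 design:   {sm17_tops:.1f} TOPS")     # 28.3, matching "over 28"
print(f"Earlier design: {earlier_tops:.1f} TOPS")  # 18.6
print(f"INT8 multiplier gain: {int8_gain:.1%}")    # 28.6%, matching "up to 30%"
```

Under that two-operations assumption, the 32,768-multiplier design at 432 MHz works out to about 28.3 TOPS, and the step from 22,400 to 28,800 INT8 multipliers is a gain of roughly 29%, both consistent with the abstract's claims.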