High Density and Performance Multiplication for FPGA

2018 IEEE 25th Symposium on Computer Arithmetic (ARITH), 2018

Cited by: 13
Abstract
Arithmetic-based applications are among the most common use cases for modern FPGAs. Machine learning is currently emerging as the fastest-growing area for FPGAs, renewing interest in low-precision multiplication. There is now a new focus on multiplication in the soft fabric: very high-density systems, consisting of many thousands of operators, are the current norm. In this paper we introduce multiplier regularization, which restructures common multiplier algorithms into smaller, more efficient architectures. The multiplier structure is parameterizable, and results are given for a continuous range of input sizes, although the algorithm is most efficient for small input precisions. The multiplier is particularly effective for typical machine learning inferencing, and the presented cores can be used for the dot products these applications require. Although the examples presented here are optimized for Intel Stratix 10 devices, the concept of regularized arithmetic structures is applicable to generic FPGA LUT architectures. Results are compared to Intel Megafunction IP and contrasted with normalized representations of recently published results for Xilinx devices. We report a 10% to 35% smaller area, and a more significant latency reduction, in the range of 25% to 50%, for typical inferencing use cases.
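The inferencing use case the abstract targets can be illustrated with a minimal software sketch. This is not the paper's regularized multiplier architecture; it is an assumed, generic model of the low-precision dot product that such FPGA cores accelerate: many small (here 8-bit) multiplications accumulated into a wider fixed-width result, as fixed hardware would do.

```python
def int8_dot(a, b, acc_bits=32):
    """Dot product of two int8 vectors with a fixed-width accumulator.

    Each product of two 8-bit operands fits in 16 bits; inference
    kernels accumulate thousands of them, which is why small soft-fabric
    multipliers are paired with a wider (here 32-bit) accumulator.
    """
    assert len(a) == len(b), "vectors must have equal length"
    acc = 0
    for x, y in zip(a, b):
        assert -128 <= x <= 127 and -128 <= y <= 127, "operands must be int8"
        acc += x * y
    # Wrap to the accumulator width and reinterpret as signed,
    # mimicking fixed-width hardware accumulation.
    acc &= (1 << acc_bits) - 1
    if acc >= 1 << (acc_bits - 1):
        acc -= 1 << acc_bits
    return acc


print(int8_dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

The sketch makes the density argument concrete: each partial product is cheap, so the cost of a dot-product engine is dominated by how compactly the many small multipliers can be packed, which is the problem multiplier regularization addresses.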
Keywords
machine learning, Intel Stratix devices, soft fabric, very high-density systems, latency reduction, Intel Megafunction IP, generic FPGA LUT architectures, regularized arithmetic structures, dot products, multiplier structure, common multiplier algorithms, multiplier regularization, high-density systems, low-precision multiplication, arithmetic-based applications, performance multiplication