More AddNet: A deeper insight into DNNs using FPGA-optimized multipliers.

ISCAS 2023

Abstract
We present a training tool flow for deep neural networks (DNNs) optimized for hardware-efficient FPGA implementation based on reconfigurable constant-coefficient multipliers (RCCMs). RCCMs replace costly generic multipliers with shift-and-add operations. Previous work showed that RCCMs are a better alternative for saving FPGA area than low-precision arithmetic. This work proposes an improved tool flow that enables layer-wise weight quantization, a larger search space through additional RCCM coefficient sets, and optimized retraining, leading to higher accuracy than the previous method. In addition, hardware requirements are lower, as only 1 to 3 adders are used per multiplication, which simultaneously reduces the overall complexity and the required memory bandwidth. We evaluate our tool flow on multiple networks (ResNets) on the ImageNet dataset.
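As a rough illustration of the shift-and-add principle behind RCCMs (a minimal sketch, not the paper's coefficient sets or reconfiguration logic; the function names and the example coefficients 10 and 6 are hypothetical), multiplying by a fixed coefficient whose binary representation has two non-zero terms costs a single adder:

    # Illustrative sketch only: constant-coefficient multiplication via shift-and-add.
    # The coefficients 10 and 6 are hypothetical examples; the paper's RCCM coefficient
    # sets are chosen so that each multiplication needs only 1 to 3 adders.

    def mul_by_10(x: int) -> int:
        # 10 = 2**3 + 2**1, so x*10 = (x << 3) + (x << 1): two shifts, one adder
        return (x << 3) + (x << 1)

    def mul_by_6(x: int) -> int:
        # 6 = 2**2 + 2**1, so x*6 = (x << 2) + (x << 1): two shifts, one adder
        return (x << 2) + (x << 1)

    if __name__ == "__main__":
        for x in (0, 1, 7, -5, 123):
            assert mul_by_10(x) == 10 * x
            assert mul_by_6(x) == 6 * x
        print("shift-and-add results match generic multiplication")

In hardware the shifts are just wiring, so each such product reduces to adder cost alone; roughly speaking, the reconfigurable part of an RCCM then selects among a pre-chosen set of such coefficients.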
Keywords
FPGA, Neural Network Accelerator, Arithmetic, Neural Network Training, Tool Flow, Fixed-Point, Low-Precision