Accelerator Design with Effective Resource Utilization for Binary Convolutional Neural Networks on an FPGA

2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (2018)

Abstract

In binary convolutional neural networks (BCNNs), arithmetic operations are replaced by bitwise operations and the required memory size is greatly reduced, which makes them a good candidate for accelerating training or inference on FPGAs. This paper proposes a BCNN architecture with a single engine that achieves high resource utilization. The proposed design deploys a large number of processing elements in parallel to increase throughput, and uses a forwarding scheme to increase resource utilization on the existing engine. In addition, we demonstrate a novel reuse scheme that lets fully-connected layers exploit the same engine. The proposed design is combined with an inference environment for comparison and implemented on a Xilinx XCVU190 FPGA. The implemented design uses 61k look-up tables (LUTs), 45k flip-flops (FFs), and 13.9 Mbit of block RAM (BRAM). It achieves 61.6 GOPS/kLUT at 240 MHz, which is 1.16 times higher than that of the best prior BCNN design, even though it uses a single engine without a configuration optimized for each layer.
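To make the "bitwise operations" remark concrete, the following is a minimal software sketch of the XNOR-popcount dot product commonly used in binary networks. It is not taken from the paper: the encoding (bit 1 for +1, bit 0 for -1) and the helper names `binarize`, `xnor_popcount_dot`, and `reference_dot` are assumptions chosen for illustration, and the paper's hardware engine may be organized quite differently.

```python
def binarize(values):
    """Pack real values into an int: bit i is 1 if values[i] >= 0 (encodes +1),
    otherwise 0 (encodes -1). The encoding is an assumption for this sketch."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two length-n {-1, +1} vectors stored as packed bits."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # XNOR: 1 where the two signs match
    matches = bin(agree).count("1")     # popcount
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


def reference_dot(a, w):
    """Plain arithmetic version over the signs, used only for checking."""
    def sign(v):
        return 1 if v >= 0 else -1
    return sum(sign(x) * sign(y) for x, y in zip(a, w))


activations = [0.5, -1.2, 3.0, -0.1, 2.0]
weights = [1.0, 2.0, -0.5, -4.0, 0.3]
result = xnor_popcount_dot(binarize(activations), binarize(weights), len(weights))
print(result)  # 1
```

The multiply-accumulate loop collapses into one XOR, one inversion, and a bit count, which is the kind of operation that maps naturally onto FPGA LUTs.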

Keywords

Machine learning, Binary convolutional neural networks, High resource utilization, FPGA
