Accelerator Design with Effective Resource Utilization for Binary Convolutional Neural Networks on an FPGA

2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2018

Abstract
In binary convolutional neural networks (BCNNs), arithmetic operations are replaced by bitwise operations and the required memory size is greatly reduced, which makes them a good opportunity to accelerate training or inference on FPGAs. This paper proposes a BCNN architecture with a single engine that achieves high resource utilization. The proposed design deploys a large number of processing elements in parallel to increase throughput, and uses a forwarding scheme to increase resource utilization on the existing engine. In addition, we demonstrate a novel reuse scheme that lets fully-connected layers exploit the same engine. The proposed design is combined with an inference environment for comparison and implemented on a Xilinx XCVU190 FPGA. The implemented design uses 61k look-up tables (LUTs), 45k flip-flops (FFs), and 13.9 Mbit of block RAM (BRAM). It achieves 61.6 GOPS/kLUT at 240 MHz, which is 1.16 times higher than that of the best prior BCNN design, even though it uses a single engine without optimal configurations on each layer.
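The abstract's remark that arithmetic operations are replaced by bitwise operations can be illustrated with the XNOR-popcount dot product commonly used in binarized networks. This is a minimal sketch, not the paper's engine: the abstract does not describe the encoding or the datapath, so the +1 -> 1, -1 -> 0 bit encoding and the helper names `pack_bits`, `binary_dot`, and `reference_dot` are assumptions made for illustration.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int (assumed encoding: +1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    # XNOR sets a bit wherever the two operands agree, i.e. where the product is +1.
    xnor = ~(a_bits ^ w_bits) & mask
    matches = bin(xnor).count("1")  # popcount
    # Each match contributes +1 and each mismatch -1: matches - (n - matches).
    return 2 * matches - n


def reference_dot(a, w):
    """Ordinary multiply-accumulate, used only to check the bitwise version."""
    return sum(x * y for x, y in zip(a, w))


# Hypothetical 8-element activation and weight vectors.
activations = [1, -1, -1, 1, 1, -1, 1, 1]
weights = [1, 1, -1, -1, 1, -1, -1, 1]

result = binary_dot(pack_bits(activations), pack_bits(weights), len(activations))
print(result, reference_dot(activations, weights))
```

Both values printed should agree. In hardware, the XNOR and popcount map onto LUTs rather than multipliers, which is why throughput per LUT (GOPS/kLUT) is a natural metric for such designs.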
Keywords
Machine learning,Binary convolutional neural networks,High resource utilization,FPGA