On-Chip Instruction Generation for Cross-Layer CNN Accelerator on FPGA

2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)(2019)

引用 4|浏览71
暂无评分
摘要
Convolutional neural networks (CNN) are gaining popularity in the field of computer vision. CNN-based methods are computational-intensive and resource-consuming, thus are hard to be integrated into embedded systems and applied to real-time task scenarios. Many FPGA based CNN accelerators have been proposed to get higher performance. Cross-layer CNN accelerator is designed to reduce the data transfer by fusing several layers. However, the instruction size that needs to be transferred is usually considerable, leading to a performance drop of cross-layer accelerators. In this study, we develop an on-chip instruction generation method based on the cross-layer accelerator to reduce the total instruction size transferred to the chip. We design the corresponding hardware module and modify existing object detection models according to the hardware structure to improve the accuracy of object detection tasks. The evaluation results show that in the same calculation process, our accelerator can achieve 35% data transfer reduction on the VGG16 network. The average instruction size and compilation time are reduced by 95% using our instruction generation method. The performance of the accelerator reaches 1414 GOP/s.
更多
查看译文
关键词
FPGA,CNN,ISA,Cross-Layer,Object Detection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要