Reconfigurable Acceleration of 3D-CNNs for Human Action Recognition with Block Floating-Point Representation

2018 28th International Conference on Field Programmable Logic and Applications (FPL), 2018

Abstract
Human action recognition (HAR) is widely used in applications such as autonomous cars and intelligent video surveillance. Among the algorithms proposed for HAR, 3D-CNNs can achieve the best accuracy. However, their algorithmic complexity substantially slows these networks, which limits their deployment in real-life applications. This paper proposes a novel customizable architecture for 3D-CNNs based on block floating-point (BFP) arithmetic, where the use of BFP significantly reduces the bitwidth while eliminating the need to retrain the network. Optimizations such as locality exploration and block alignment with 3D blocking are performed to improve performance and accuracy. An analytical model and tool are developed to predict the optimized parameters for hardware customization based on user constraints such as FPGA resources or accuracy requirements. Experiments show that, without retraining, a 15-bit mantissa design using single-precision accumulation on a Xilinx ZC706 device can be 8.2 times faster than an Intel i7-950 processor at 3.07 GHz with only 0.4% accuracy loss.
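
To make the block floating-point idea concrete, the following is a minimal software sketch of generic BFP quantization: every value in a block shares one exponent, and each value keeps only a short signed integer mantissa. It is not the paper's hardware architecture. The function names bfp_quantize and bfp_dequantize are hypothetical, and treating the 15-bit mantissa as including the sign bit is an assumption made for illustration.

```python
import math

def bfp_quantize(block, mantissa_bits=15):
    """Quantize a block of floats to one shared exponent plus integer mantissas.

    Assumption: mantissa_bits counts the sign bit, so the magnitude
    uses mantissa_bits - 1 bits.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1,
    # so the largest value in the block sets the shared exponent.
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (mantissa_bits - 1 - shared_exp)
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to the nearest integer mantissa and clamp to the signed range.
    mantissas = [max(-limit, min(limit, round(x * scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=15):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    step = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * step for m in mantissas]

if __name__ == "__main__":
    block = [0.75, -0.1, 0.003, 0.5]
    exp, mants = bfp_quantize(block)
    print(exp, mants)
    print(bfp_dequantize(exp, mants))
```

Because the exponent is stored once per block, multiplications inside a block reduce to integer mantissa products. Values much smaller than the block maximum lose precision, which is why block size and alignment affect accuracy.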
Keywords
deep learning,FPGA,3D CNN,Human action recognition