SqueezeJet-3: An HLS-based Accelerator for Edge CNN Applications on SoC FPGAs

ICAT(2023)

Abstract
Most FPGA-based Convolutional Neural Network (CNN) hardware accelerators target the datacenter rather than edge processing units. To help fill this gap, this work presents SqueezeJet-3 and the corresponding design flow of a novel FPGA-based embedded system, consisting of software and hardware, for accelerating edge CNN inference. SqueezeJet-3 is optimized for accelerating small ImageNet-class CNNs, such as SqueezeNet v1.1 and ZynqNet, on low-end, low-cost SoC FPGA devices. SqueezeJet-3 is evaluated against the DietChai accelerator, which is part of Xilinx's ChaiDNN v2 framework, in terms of performance, resource utilization, power, and accuracy; the results demonstrate that, for the acceleration of SqueezeNet v1.1, SqueezeJet-3 outperforms DietChai in all four categories. Our evaluation results also show that, using the presented design framework, a developer can implement FPGA accelerators for larger CNNs, such as VGG16, with performance similar to that of accelerators designed with the Angel-Eye and fpgaConvNet frameworks, which are optimized for VGG16-like CNNs.
Keywords
Algorithm-to-HLS Workflow, High-Level Synthesis, FPGA CNN Accelerator, Deep Learning Application, Mobile Embedded Systems