HPIPE: Heterogeneous Layer-Pipelined and Sparse-Aware CNN Inference for FPGAs

FPGA 2020

Abstract
This poster presents a novel cross-layer-pipelined Convolutional Neural Network accelerator architecture, and an accompanying network compiler, that use precision minimization and parameter pruning to fit ResNet-50 entirely into on-chip memory on a Stratix 10 2800 FPGA. By statically partitioning the hardware across the layers of the network, our architecture enables full DSP utilization and reduces the soft-logic-per-DSP ratio by roughly 4x over prior work on sparse CNN accelerators for FPGAs. This high DSP utilization, a 420 MHz clock frequency, and zero-weight skipping enable our architecture to execute a sparse ResNet-50 model at batch size 1 at 3300 images/s, nearly 3x higher throughput than NVIDIA's fastest machine-learning-targeted GPU, the V100. We also present a network compiler and a flexible hardware interface that make it easy to add support for new types of neural networks and to optimize those networks for FPGAs with different on-chip resources.
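To make the zero-weight-skipping idea concrete, below is a minimal software sketch, not the paper's hardware design: a pruned filter is stored as its nonzero values plus their positions, so the multiply-accumulate loop does work only for nonzero weights. The compressed format and function names here are assumptions for illustration.

    # Illustrative software analogue of zero-weight skipping; HPIPE itself
    # realizes this in FPGA hardware. Names and format are hypothetical.
    import numpy as np

    def compress_weights(weights):
        """Keep only nonzero weights, with their flat positions."""
        idx = np.flatnonzero(weights)
        return idx, weights.flat[idx]

    def sparse_dot(activations, idx, vals):
        """Multiply-accumulate over nonzero weights only; zeros are skipped."""
        return np.dot(activations.flat[idx], vals)

    # Usage: a pruned 3x3 filter with two surviving weights.
    w = np.array([[0.0, 0.5,  0.0],
                  [0.0, 0.0, -1.2],
                  [0.0, 0.0,  0.0]])
    x = np.random.rand(3, 3).astype(np.float32)

    idx, vals = compress_weights(w)
    assert np.isclose(sparse_dot(x, idx, vals), np.sum(w * x))
    # Work scales with the nonzero count (2 MACs here, not 9), which is
    # the source of the sparse speedup the abstract claims.

For scale, the quoted 420 MHz clock and 3300 images/s throughput together imply that one image completes roughly every 420e6 / 3300 ≈ 127,000 clock cycles through the layer pipeline.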
Keywords
FPGAs, layer-pipelined, sparse-aware