AoCStream: All-on-Chip CNN Accelerator with Stream-Based Line-Buffer Architecture and Accelerator-Aware Pruning

SENSORS(2023)

引用 0|浏览1
暂无评分
摘要
Convolutional neural networks (CNNs) play a crucial role in many EdgeAI and TinyML applications, but their implementation usually requires external memory, which degrades the feasibility of such resource-hungry environments. To solve this problem, this paper proposes memory-reduction methods at the algorithm and architecture level, implementing a reasonable-performance CNN with the on-chip memory of a practical device. At the algorithm level, accelerator-aware pruning is adopted to reduce the weight memory amount. For activation memory reduction, a stream-based line-buffer architecture is proposed. In the proposed architecture, each layer is implemented by a dedicated block, and the layer blocks operate in a pipelined way. Each block has a line buffer to store a few rows of input data instead of a frame buffer to store the whole feature map, reducing intermediate data-storage size. The experimental results show that the object-detection CNNs of MobileNetV1/V2 and an SSDLite variant, widely used in TinyML applications, can be implemented even on a low-end FPGA without external memory.
更多
查看译文
关键词
convolutional neural networks,CNN accelerator,pruning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要