Doubly-Block Circulant Kernel Matrix Exploitation in Convolutional Accelerators.

Midwest Symposium on Circuits and Systems (2023)

Abstract
In this paper, we present a novel algorithmic and hardware co-design approach specifically tailored for efficient 2D convolution implementations, a crucial operation in convolutional neural networks (CNNs). Our method addresses the limitations of existing software-based solutions and hardware-based architectures, delivering significant improvements in asymptotic behavior for generic convolution cases. By leveraging the distinctive geometry of doubly block circulant unrolled kernel matrices, our approach eliminates the need for input and weight buffers, optimizes output memory usage, and minimizes redundant memory accesses. A comprehensive comparative analysis with state-of-the-art techniques showcases the key advantages and superior performance of our proposed method, achieving substantial reductions in memory requirements and high throughput.
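The structure the abstract refers to can be illustrated with a small numerical sketch. This is not the paper's accelerator design; it only shows, under stated assumptions, how a 2D convolution with circular (wrap-around) boundaries becomes a single matrix-vector product whose matrix is doubly block circulant. The function names `doubly_block_circulant` and `circular_conv2d` are hypothetical, and the circular boundary and true-convolution (flipped-kernel) convention are assumptions made for the illustration; the abstract does not specify either.

```python
import numpy as np


def doubly_block_circulant(kernel, height, width):
    """Unroll `kernel` into the (H*W) x (H*W) matrix of a circular 2D convolution.

    Assumption: wrap-around boundaries, so entry [(i, j), (p, q)] equals
    padded[(i - p) mod H, (j - q) mod W].
    """
    kh, kw = kernel.shape
    padded = np.zeros((height, width), dtype=float)
    padded[:kh, :kw] = kernel  # zero-pad the kernel to the input size

    rows = np.arange(height)
    cols = np.arange(width)
    dr = (rows[:, None] - rows[None, :]) % height  # (H, H) row offsets
    dc = (cols[:, None] - cols[None, :]) % width   # (W, W) column offsets

    # Broadcast to shape (H, W, H, W), then flatten to a 2D matrix.
    m = padded[dr[:, None, :, None], dc[None, :, None, :]]
    return m.reshape(height * width, height * width)


def circular_conv2d(x, kernel):
    """Reference circular convolution computed by shifting and accumulating."""
    out = np.zeros_like(x, dtype=float)
    for a in range(kernel.shape[0]):
        for b in range(kernel.shape[1]):
            # np.roll gives rolled[i, j] = x[(i - a) mod H, (j - b) mod W]
            out += kernel[a, b] * np.roll(x, shift=(a, b), axis=(0, 1))
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))       # toy 4 x 5 input
k = rng.standard_normal((3, 3))       # toy 3 x 3 kernel
m = doubly_block_circulant(k, 4, 5)   # 20 x 20 unrolled kernel matrix

# The matrix-vector product reproduces the direct circular convolution.
assert np.allclose(m @ x.ravel(), circular_conv2d(x, k).ravel())
```

Each W x W block of `m` is circulant, and the blocks themselves repeat in a circulant pattern, so the whole matrix is determined by the H*W padded kernel values. That redundancy is the kind of regularity a hardware design can exploit instead of storing or re-fetching the full matrix. With zero padding instead of wrap-around, the analogous matrix is doubly block Toeplitz rather than circulant.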
Keywords
2D Convolution, Systolic Array, Unrolled Kernel Matrix, Doubly-Blocked Circulant Matrix