High-Performance Architecture Aware Sparse Convolutional Neural Networks for GPUs.

PACT (2022)

Abstract
Convolutional Neural Networks (CNNs) are used to analyze data with spatial or temporal structure. In recent years, their popularity has grown rapidly owing to their accuracy and broad applicability. Because CNNs are deployed at massive scale, especially in the automotive industry, image analytics, and portable devices, even a fractional improvement in performance or power consumption can lead to enormous savings. In this work, we focus on exploiting the sparsity of feature maps to reduce the number of computations and the amount of data movement required, which improves performance. Unlike kernel sparsity, where the sparsity structure is known a priori, feature map sparsity is known only at runtime, which makes this a challenging optimization problem, especially for GPUs. In this paper, we develop a GPU-friendly sparse CNN framework capable of handling feature map sparsity. The efficacy of our approach is demonstrated by comparing the performance of our implementation with state-of-the-art implementations. Our approach can also be extended to support upcoming techniques such as feature map pruning and submanifold sparse convolutional networks.
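
To make the idea of feature map sparsity concrete, the following is a minimal CPU-side sketch in plain Python. It is not the paper's GPU framework, whose design the abstract does not describe. It assumes a single-channel, stride-1, unpadded 2D convolution (cross-correlation form), and the function names `conv2d_dense` and `conv2d_sparse` are hypothetical. The sparse variant discovers the nonzero activations at runtime and scatters only their contributions, so the count of multiply-accumulate operations drops with the number of zeros in the input.

```python
def conv2d_dense(x, k):
    """Reference dense convolution: visits every input element in every window."""
    H, W = len(x), len(x[0])
    R, S = len(k), len(k[0])
    out_h, out_w = H - R + 1, W - S + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    macs = 0  # multiply-accumulate counter
    for i in range(out_h):
        for j in range(out_w):
            for r in range(R):
                for s in range(S):
                    out[i][j] += x[i + r][j + s] * k[r][s]
                    macs += 1
    return out, macs


def conv2d_sparse(x, k):
    """Illustrative sparse variant (an assumption, not the authors' kernel):
    only nonzero activations contribute, and each one is scattered to the
    output positions whose window covers it."""
    H, W = len(x), len(x[0])
    R, S = len(k), len(k[0])
    out_h, out_w = H - R + 1, W - S + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    macs = 0
    # The nonzero pattern is only discovered here, at runtime.
    nonzeros = [(y, c, v)
                for y, row in enumerate(x)
                for c, v in enumerate(row) if v != 0]
    for y, c, v in nonzeros:
        for r in range(R):
            i = y - r  # output row whose window places k[r][*] on input row y
            if i < 0 or i >= out_h:
                continue
            for s in range(S):
                j = c - s  # output column whose window places k[*][s] on input column c
                if j < 0 or j >= out_w:
                    continue
                out[i][j] += v * k[r][s]
                macs += 1
    return out, macs


# A mostly-zero feature map, such as one produced by a ReLU layer.
feature_map = [
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0],
]
kernel = [
    [1.0, 0.0, -1.0],
    [2.0, 0.0, -2.0],
    [1.0, 0.0, -1.0],
]

dense_out, dense_macs = conv2d_dense(feature_map, kernel)
sparse_out, sparse_macs = conv2d_sparse(feature_map, kernel)
print(dense_out == sparse_out, dense_macs, sparse_macs)
```

Both functions produce the same output, but the sparse one performs fewer multiply-accumulates. On a GPU the difficulty the abstract points to is that this data-dependent nonzero pattern leads to irregular memory access and uneven work across threads, which a sequential sketch like this one does not capture.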