Accelerating Computer Vision Tasks on GPUs using Ramanujan Graph Product Framework.

Furqan Ahmed Shaik, Thejasvi Konduru, Girish Varma, Kishore Kothapalli

COMAD/CODS (2023)

Abstract
Sparse neural networks have been shown to achieve better runtimes than dense neural networks, and the runtime gains are larger with structured sparsity. However, designing a sparsity structure that preserves both runtime and accuracy is challenging. In this paper, we implement the RBGP4 sparsity pattern, derived from the Ramanujan Bipartite Graph Product (RBGP) framework, on various computer vision tasks and evaluate its accuracy and runtime. Using this approach, we generate structured sparse neural networks that have multiple levels of block sparsity and good connectivity due to the underlying Ramanujan bipartite graphs. We benchmark our approach on semantic segmentation and pose estimation tasks on an edge device (Jetson Nano 2GB) as well as on server GPUs (V100). We compare the results for the RBGP4 sparsity pattern with those for unstructured and block sparsity patterns, and obtain significant speedups over both while maintaining accuracy.
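The abstract does not spell out how the RBGP4 mask is built. As a rough illustration only of what "multiple levels of block sparsity" from a graph product can look like, the sketch below forms a weight mask as a Kronecker product of small bipartite biadjacency matrices. Everything in it is an assumption made for illustration: the helper names `circulant_biadjacency` and `product_mask` are hypothetical, the circulant graphs are simple regular stand-ins rather than Ramanujan graphs, and the choice of three factors is arbitrary and not the paper's RBGP4 configuration.

```python
import numpy as np


def circulant_biadjacency(n, degree):
    """Biadjacency matrix of a regular bipartite graph on n + n vertices.

    Row i is connected to columns i, i+1, ..., i+degree-1 (mod n).
    This circulant graph is a placeholder, NOT a Ramanujan graph.
    """
    mask = np.zeros((n, n), dtype=np.uint8)
    for i in range(n):
        for k in range(degree):
            mask[i, (i + k) % n] = 1
    return mask


def product_mask(factors):
    """Kronecker product of the given 0/1 factor matrices, left to right."""
    mask = np.ones((1, 1), dtype=np.uint8)
    for factor in factors:
        mask = np.kron(mask, factor)
    return mask


# Hypothetical three-level pattern: sparse outer graph, dense 2x2 block,
# sparse inner graph. Sizes and degrees are illustrative assumptions.
outer = circulant_biadjacency(4, 2)        # 50% dense
block = np.ones((2, 2), dtype=np.uint8)    # fully dense block
inner = circulant_biadjacency(4, 2)        # 50% dense

mask = product_mask([outer, block, inner])  # shape (32, 32)
density = float(mask.mean())                # product of factor densities

# A dense weight matrix would be masked elementwise, e.g. weights * mask.
print(mask.shape, density)
```

Because every factor is regular, each row and each column of the resulting mask holds the same number of nonzeros, which is the kind of uniform structure that block-sparse GPU kernels can exploit.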