SparG: A Sparse GEMM Accelerator for Deep Learning Applications

Algorithms and Architectures for Parallel Processing, Lecture Notes in Computer Science (2022)

Abstract
Deep learning has become an active field of research. Deep learning algorithms were previously run mainly on CPUs and GPUs. As deep learning has developed rapidly, these processors have proved unable to keep up with its large-scale, specialized computations, and customized deep learning accelerators have become popular. The main workload of most deep learning models is General Matrix-matrix Multiplication (GEMM), and emerging GEMMs are highly sparse and irregular. The TPU and SIGMA are recent state-of-the-art GEMM accelerators, but the TPU does not support sparsity, and SIGMA leaves some Processing Elements (PEs) underutilized. In this paper, we design and implement SparG, a flexible sparse GEMM accelerator. SparG has a specialized PE structure, a flexible distribution network, and an efficient reduction network. For sparse and irregular GEMMs, SparG maintains high PE utilization while taking advantage of sparsity. We run sparse and irregular GEMMs on the TPU, SIGMA, and SparG. The experimental results show that SparG achieves the highest performance (30x better than the TPU and 3.6x better than SIGMA) while adding only a small amount of hardware overhead (~20% more than the TPU and ~10% more than SIGMA).
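To make the motivation concrete, the following minimal Python sketch contrasts a dense GEMM, which performs every multiply-accumulate (MAC), with a sparse GEMM that skips any MAC with a zero operand. It is a software illustration only and does not model the PE structure, distribution network, or reduction network of SparG, the TPU, or SIGMA. The function names, the example matrices, and the "useful fraction" metric are assumptions introduced here, not taken from the paper.

```python
def dense_gemm(a, b):
    """Multiply a (M x K) by b (K x N), counting every multiply-accumulate."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    macs = 0
    for i in range(m):
        for j in range(n):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
                macs += 1
    return c, macs


def sparse_gemm(a, b):
    """Same product, but skip every MAC in which either operand is zero."""
    m, n = len(a), len(b[0])
    # Compress each row of b to (column, value) pairs of its nonzeros.
    b_rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in b]
    c = [[0.0] * n for _ in range(m)]
    macs = 0
    for i, row in enumerate(a):
        for p, a_val in enumerate(row):
            if a_val == 0:
                continue  # a zero in a contributes nothing to row i of c
            for j, b_val in b_rows[p]:
                c[i][j] += a_val * b_val
                macs += 1
    return c, macs


# Hypothetical sparse, irregular operands (illustrative values only).
a = [[1, 0, 0, 2],
     [0, 0, 0, 0],
     [0, 3, 0, 0]]
b = [[0, 1],
     [4, 0],
     [0, 0],
     [5, 0]]

c_dense, dense_macs = dense_gemm(a, b)
c_sparse, sparse_macs = sparse_gemm(a, b)
useful_fraction = sparse_macs / dense_macs  # share of dense MACs that matter
```

In this toy case both functions return the same product, but only a small share of the dense MACs involve two nonzero operands. That gap is the kind of wasted work a sparsity-aware accelerator aims to avoid.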
Keywords
sparse gemm accelerator,deep learning,deep learning applications