Characterization of Data Movement Requirements for Sparse Matrix Computations on GPUs

2017 IEEE 24th International Conference on High Performance Computing (HiPC) (2017)

Abstract
Tight data movement lower bounds are known for dense matrix-vector multiplication and dense matrix-matrix multiplication, and practical GPU implementations exist that achieve performance quite close to the roofline bounds based on operational intensity. For large dense matrices, matrix-vector multiplication is bandwidth-limited, and its performance is significantly lower than that of matrix-matrix multiplication. In contrast, the performance of sparse matrix-matrix multiplication (SpGEMM) is generally much lower than that of sparse matrix-vector multiplication (SpMV). In this paper, we use a combination of lower-bound and upper-bound analysis of data movement requirements, together with hardware-counter-based measurements, to gain insight into the performance limitations of existing SpGEMM implementations on GPUs. The analysis motivates the development of an adaptive strategy for distributing work among threads, which improves the performance of SpGEMM code on GPUs.
Keywords
data movement bounds, sparse matrix-vector multiplication (SpMV), sparse matrix-matrix multiplication (SpGEMM), graph analytics, hypergraph partitioning, GPU computing
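
Illustrative note: the dense-case contrast described in the abstract can be sketched with a back-of-the-envelope roofline calculation. The sketch below is not taken from the paper. It assumes square n x n double-precision operands and counts only compulsory traffic (each operand read or written exactly once), which is a simplification of the tight data movement bounds the abstract refers to. The function names and the peak and bandwidth figures are hypothetical placeholders, not measurements of any particular GPU.

```python
def gemv_intensity(n, word_bytes=8):
    """Flops per byte for y = A @ x with an n x n matrix (compulsory traffic only)."""
    flops = 2 * n * n                           # one multiply and one add per matrix entry
    bytes_moved = word_bytes * (n * n + 2 * n)  # read A and x, write y
    return flops / bytes_moved


def gemm_intensity(n, word_bytes=8):
    """Flops per byte for C = A @ B with n x n matrices (compulsory traffic only)."""
    flops = 2 * n ** 3
    bytes_moved = word_bytes * 3 * n * n        # read A and B, write C
    return flops / bytes_moved


def roofline(intensity, peak_flops, bandwidth):
    """Attainable flop/s under the roofline model: min(peak, intensity * bandwidth)."""
    return min(peak_flops, intensity * bandwidth)


if __name__ == "__main__":
    # Hypothetical machine parameters, chosen only for illustration.
    peak = 5e12  # flop/s
    bw = 5e11    # bytes/s
    for n in (1024, 4096, 16384):
        for name, oi in (("GEMV", gemv_intensity(n)), ("GEMM", gemm_intensity(n))):
            print(f"n={n:6d} {name}: {oi:9.3f} flop/byte, "
                  f"bound {roofline(oi, peak, bw):.3e} flop/s")
```

Under these assumptions the intensity of matrix-vector multiplication stays just below 0.25 flop/byte regardless of n, so its bound is set by memory bandwidth, while the intensity of matrix-matrix multiplication grows as n/12 and quickly reaches the compute peak. The sparse case studied in the paper does not follow this simple pattern, which is why a separate analysis is needed.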