Parallel Sparse Matrix-Vector And Matrix-Transpose-Vector Multiplication Using Compressed Sparse Blocks

SPAA(2009)

Cited 527 | Viewed 336
Abstract
This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is an n × n sparse matrix with nnz ≥ n nonzeros and x is a dense n-vector. Our algorithms use Θ(nnz) work (serial running time) and Θ(√n lg n) span (critical-path length), yielding a parallelism of Θ(nnz/√n lg n), which is amply high for virtually any large matrix. The storage requirement for CSB is essentially the same as that for the more-standard compressed-sparse-rows (CSR) format, for which computing Ax in parallel is easy but Aᵀx is difficult. Benchmark results indicate that on one processor, the CSB algorithms for Ax and Aᵀx run just as fast as the CSR algorithm for Ax, but the CSB algorithms also scale up linearly with processors until limited by off-chip memory bandwidth.
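To illustrate the symmetry the abstract describes, here is a minimal serial sketch (not the paper's actual implementation) of a CSB-like layout: the matrix is partitioned into β × β blocks stored in row-major block order, and each nonzero records its (row, column) offset within its block. Because the layout treats rows and columns symmetrically, the same structure serves both Ax and Aᵀx. The function names, the dense input, and the requirement that β divide n are simplifying assumptions for this sketch.

```python
import numpy as np

def to_csb(A, beta):
    """Pack a dense n x n matrix A into a simplified CSB-like layout.
    Blocks of size beta x beta are stored in row-major block order;
    each nonzero keeps its (row, col) offset *within* its block.
    Assumes beta divides n (the paper handles the general case)."""
    n = A.shape[0]
    nb = n // beta
    blk_ptr = [0]                     # start index of each block's nonzeros
    rows, cols, vals = [], [], []
    for bi in range(nb):
        for bj in range(nb):
            block = A[bi*beta:(bi+1)*beta, bj*beta:(bj+1)*beta]
            r, c = np.nonzero(block)  # offsets local to this block
            rows.extend(r)
            cols.extend(c)
            vals.extend(block[r, c])
            blk_ptr.append(len(vals))
    return np.array(blk_ptr), np.array(rows), np.array(cols), np.array(vals), nb, beta

def csb_spmv(csb, x, transpose=False):
    """Compute A @ x, or A.T @ x, from one and the same CSB structure:
    the transpose product just swaps the roles of the row and column
    offsets, which is the symmetry CSB is designed around."""
    blk_ptr, rows, cols, vals, nb, beta = csb
    y = np.zeros(nb * beta)
    for b in range(nb * nb):
        bi, bj = divmod(b, nb)        # block coordinates
        for k in range(blk_ptr[b], blk_ptr[b + 1]):
            if transpose:
                y[bj*beta + cols[k]] += vals[k] * x[bi*beta + rows[k]]
            else:
                y[bi*beta + rows[k]] += vals[k] * x[bj*beta + cols[k]]
    return y
```

In the parallel algorithm of the paper, the block rows (or block columns, for the transpose product) and the blocks within them are processed with a recursive divide-and-conquer rather than the serial loops above; the sketch only shows the storage layout and the row/column symmetry.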
Keywords
Compressed sparse blocks, compressed sparse columns, compressed sparse rows, matrix transpose, matrix-vector multiplication, multithreaded algorithm, parallelism, span, sparse matrix, storage format, work