Distributed-memory hierarchical compression of dense SPD matrices.

SC 2018

Abstract
We present a distributed-memory algorithm for the hierarchical compression of symmetric positive definite (SPD) matrices. Our method is based on GOFMM, an algorithm that appeared in doi:10.1145/3126908.3126921. For many SPD matrices, GOFMM enables compression and approximate matrix-vector multiplication in O(N log N) time, as opposed to the O(N²) required by a dense matrix. But GOFMM supports only shared-memory parallelism. In this paper, we use the Message Passing Interface (MPI) to extend the ideas of GOFMM to the distributed-memory setting. We also propose and implement an asynchronous algorithm for faster multiplication. We present different usage scenarios on a selection of SPD matrices related to graphs, neural networks, and covariance operators, with results on the Texas Advanced Computing Center's "Stampede 2" system. We also compare with the STRUMPACK software package, which, to our knowledge, is the only other available software that can compress arbitrary SPD matrices in parallel. In our largest run, we were able to compress a 67M-by-67M matrix in less than three minutes and perform a multiplication with 512 vectors within 5 seconds on 6,144 Intel "Skylake" cores.
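As a rough illustration of the idea behind this kind of compression (a minimal one-level sketch, not the GOFMM algorithm itself): if an SPD matrix's off-diagonal blocks are numerically low rank, they can be stored and applied via low-rank factors, so a matrix-vector product costs far less than the dense O(N²). The matrix construction and rank below are illustrative assumptions.

```python
import numpy as np

def compress_one_level(A, split, rank):
    """One-level HODLR-style compression of a symmetric matrix A:
    keep the two diagonal blocks dense, replace the off-diagonal
    block with a truncated-SVD low-rank factorization A12 ~ U @ Vt."""
    A11, A22 = A[:split, :split], A[split:, split:]
    U, s, Vt = np.linalg.svd(A[:split, split:], full_matrices=False)
    return A11, A22, U[:, :rank] * s[:rank], Vt[:rank, :]

def hmatvec(A11, A22, U, Vt, x):
    """Matrix-vector product with the compressed form; each off-diagonal
    block costs O(N * rank) to apply instead of O(N^2 / 4)."""
    split = A11.shape[0]
    x1, x2 = x[:split], x[split:]
    y1 = A11 @ x1 + U @ (Vt @ x2)      # A12 applied via its low-rank factors
    y2 = A22 @ x2 + Vt.T @ (U.T @ x1)  # symmetry gives A21 = A12.T
    return np.concatenate([y1, y2])

# Demo: an SPD matrix whose off-diagonal block is exactly rank 3,
# so rank-3 compression reproduces the dense matvec to rounding error.
rng = np.random.default_rng(0)
n, split, rank = 200, 100, 3
B1 = rng.standard_normal((split, split))
B2 = rng.standard_normal((n - split, n - split))
C = rng.standard_normal((split, rank)) @ rng.standard_normal((rank, n - split))
A = np.zeros((n, n))
A[:split, :split] = B1 @ B1.T + n * np.eye(split)       # SPD diagonal blocks
A[split:, split:] = B2 @ B2.T + n * np.eye(n - split)
A[:split, split:] = C
A[split:, :split] = C.T

factors = compress_one_level(A, split, rank)
x = rng.standard_normal(n)
y = hmatvec(*factors, x)
rel_err = np.linalg.norm(y - A @ x) / np.linalg.norm(A @ x)
```

Recursing this split on the diagonal blocks (and choosing ranks adaptively) is what yields the hierarchical, near-linear-complexity structure that methods like GOFMM exploit.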
Keywords
Sparse matrices, Task analysis, Covariance matrices, Skeleton, Symmetric matrices, Approximation algorithms, Kernel