Distributed O(N) Linear Solver for Dense Symmetric Hierarchical Semi-Separable Matrices

2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)(2019)

引用 27|浏览10
暂无评分
摘要
We present a distributed memory algorithm for the approximate hierarchical factorization of symmetric positive definite (SPD) matrices. Our method is based on the distributed memory GOFMM, an algorithm that appeared in SC18 (doi:10.1109/SC.2018.00018). GOFMM constructs a hierarchical matrix approximation of an arbitrary SPD matrix that compresses the matrix by creating low-rank approximations of the off-diagonal blocks. GOFMM method has no guarantees of success for arbitrary SPD matrices. (This is similar to the SVD; not every matrix admits a good low-rank approximation.) But for many SPD matrices, GOFMM does enable compression that results in fast matrix-vector multiplication that can reach N logN time—as opposed to N2 required for a dense matrix. GOFMM supports shared and distributed memory parallelism. In this paper, we build an approximate "ULV" factorization based on the Hierarchically Semi-Separable (HSS) compression of the GOFMM. This factorization requires O(N) work (given the compressed matrix) and O(N=p) + O(log p) time on p MPI processes (assuming a hypercube topology). The previous state-of-the-art required O(N logN) work. We present the factorization algorithm, discuss its complexity, and present weak and strong scaling results for the "factorization" and "solve" phases of our algorithm. We also discuss the performance of the inexact ULV factorization as a preconditioner for a few exemplary large dense linear systems. In our largest run, we were able to factorize a 67M-by-67M matrix in less than one second; and solve a system with 64 right-hand sides in less than one-tenth of a second. This run was on 6,144 Intel "Skylake" cores on the SKX partition of the Stampede2 system at the Texas Advanced Computing Center.
更多
查看译文
关键词
Hierarchical matrix algorithm,fast linear solver,parallel algorithm
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要