An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs

CoRR(2023)

引用 0|浏览15
暂无评分
摘要
The generalized Dryja--Smith--Widlund (GDSW) preconditioner is a two-level overlapping Schwarz domain decomposition (DD) preconditioner that couples a classical one-level overlapping Schwarz preconditioner with an energy-minimizing coarse space. When used to accelerate the convergence rate of Krylov subspace iterative methods, the GDSW preconditioner provides robustness and scalability for the solution of sparse linear systems arising from the discretization of a wide range of partial different equations. In this paper, we present FROSch (Fast and Robust Schwarz), a domain decomposition solver package which implements GDSW-type preconditioners for both CPU and GPU clusters. To improve the solver performance on GPUs, we use a novel decomposition to run multiple MPI processes on each GPU, reducing both solver's computational and storage costs and potentially improving the convergence rate. This allowed us to obtain competitive or faster performance using GPUs compared to using CPUs alone. We demonstrate the performance of FROSch on the Summit supercomputer with NVIDIA V100 GPUs, where we used NVIDIA Multi-Process Service (MPS) to implement our decomposition strategy. The solver has a wide variety of algorithmic and implementation choices, which poses both opportunities and challenges for its GPU implementation. We conduct a thorough experimental study with different solver options including the exact or inexact solution of the local overlapping subdomain problems on a GPU. We also discuss the effect of using the iterative variant of the incomplete LU factorization and sparse-triangular solve as the approximate local solver, and using lower precision for computing the whole FROSch preconditioner. Overall, the solve time was reduced by factors of about $2\times$ using GPUs, while the GPU acceleration of the numerical setup time depend on the solver options and the local matrix sizes.
更多
查看译文
关键词
domain decomposition solver package,energy-minimizing coarse space,Fast and Robust Schwarz,FROSch preconditioner,GDSW-type preconditioners,generalized Dryja-Smith-Widlund preconditioner,GPU acceleration,local overlapping subdomain problems,local solver,multiple MPI processes,NVIDIA multiprocess service,NVIDIA V100 GPU,one-level overlapping Schwarz preconditioner,partial different equations,robustness,sparse linear systems,two-level overlapping Schwarz domain decomposition preconditioner
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要