GPU acceleration of extreme scale pseudo-spectral simulations of turbulence using asynchronism

Kiran Ravikumar, David Appelhans, P. K. Yeung

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2019)


Abstract

This paper presents new advances in GPU-driven Fourier pseudo-spectral numerical algorithms, which allow the simulation of turbulent fluid flow at problem sizes beyond the current state of the art. In contrast to several massively parallel petascale systems, the dense nodes of Summit, Sierra, and expected exascale machines can be exploited with coarser MPI decompositions which result in improved MPI all-to-all scaling. An asynchronous batching strategy, combined with the fast hardware connection between the large CPU memory and the fast GPUs, allows effective use of the GPUs on problem sizes which are too large to reside in GPU memory. Communication performance is further improved by a hybrid MPI+OpenMP approach. Favorable performance is obtained up to a 18432^3 problem size on 3072 nodes of Summit, with a GPU to CPU speedup of 4.7 for a 12288^3 problem size (the largest problem size previously published in turbulence literature).
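To give a rough sense of why the largest problem cannot reside in GPU memory, the following is a minimal back-of-the-envelope sketch, not the authors' code. The grid size and node count come from the abstract. Everything else is an assumption made for illustration: single-precision storage (4 bytes per value), 6 GPUs per node with 16 GiB each, and the hypothetical parameter `n_arrays` (the number of full-size 3D arrays a solver keeps, which the abstract does not state).

```python
import math

GIB = 2**30  # bytes in one gibibyte


def points_per_node(n, nodes):
    """Grid points held by each node when an n^3 grid is split evenly."""
    total = n**3
    if total % nodes != 0:
        raise ValueError("grid does not divide evenly across nodes")
    return total // nodes


def gib_per_node(n, nodes, n_arrays=1, bytes_per_value=4):
    """Memory per node in GiB for n_arrays full-size arrays.

    bytes_per_value=4 assumes single precision (an assumption, not from the abstract).
    """
    return points_per_node(n, nodes) * n_arrays * bytes_per_value / GIB


def min_batches(n, nodes, n_arrays, gpus_per_node=6, gib_per_gpu=16,
                bytes_per_value=4):
    """Lower bound on the number of batches needed to stream a node's data
    through its GPUs. The GPU count and capacity are assumed values, and the
    bound ignores FFT work buffers and communication buffers."""
    needed = gib_per_node(n, nodes, n_arrays, bytes_per_value)
    available = gpus_per_node * gib_per_gpu
    return math.ceil(needed / available)


if __name__ == "__main__":
    # One single-precision array of the 18432^3 grid on 3072 nodes.
    print(gib_per_node(18432, 3072))
    # Hypothetical solver state of 20 arrays (illustrative value only).
    print(min_batches(18432, 3072, n_arrays=20))
```

Under these assumptions a single array already takes several GiB per node, so a solver holding a few tens of such arrays would exceed the combined GPU memory of a node and would have to move data between CPU memory and the GPUs in batches. The actual batch count depends on buffer sizes and on the real number of arrays, neither of which is given here.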


Keywords

CUDA, FFT, GPU, MPI, algorithm, all-to-all, asynchronous, communication, distributed, out-of-core, simulations, summit, turbulence
