Optimizing 3D Multigrid to be Comparable with the FFT
msra
Abstract
Multigrid is an iterative algorithm that uses nearest-neighbor computations to solve PDEs on a regular mesh. Although multigrid is an O(n) algorithm, on problem sizes of interest it generally does not run as fast as the highly optimized FFT, which is an O(n log n) algorithm. This paper examines several optimizations to the multigrid approach for solving PDEs on a three-dimensional grid. The NAS Multigrid (NASMG) benchmark, a recognized standard, serves as the baseline for our performance analysis. We present a performance model for 3D multigrid that incorporates architectural parameters for the processor and memory system. Benchmark results for potential optimizations are obtained on multiple architectures. The Performance Application Programming Interface (PAPI) library is used to examine hardware behavior in detail. We compare these results with the predictions of our performance model and discuss the architectural requirements for further improving the speed of 3D multigrid.