Compiler-Directed Transformation for Higher-Order Stencils

International Parallel & Distributed Processing Symposium(2015)

引用 44|浏览139
暂无评分
摘要
As the cost of data movement increasingly dominates performance, developers of finite-volume and finite-difference solutions for partial differential equations (PDEs) are exploring novel higher-order stencils that increase numerical accuracy and computational intensity. This paper describes a new compiler reordering transformation applied to stencil operators that performs partial sums in buffers, and reuses the partial sums in computing multiple results. This optimization has multiple effect son improving stencil performance that are particularly important to higher-order stencils: exploits data reuse, reduces floating-point operations, and exposes efficient SIMD parallelism to backend compilers. We study the benefit of this optimization in the context of Geometric Multigrid (GMG), a widely used method to solvePDEs, using four different Jacobi smoothers built from 7-, 13-, 27- and 125-point stencils. We quantify performance, speedup, andnumerical accuracy, and use the Roofline model to qualify our results. Ultimately, we obtain over 4× speedup on the smoothers themselves and up to a 3× speedup on the multigrid solver. Finally, we demonstrate that high-order multigrid solvers have the potential of reducing total data movement and energy by several orders of magnitude.
更多
查看译文
关键词
Compiler Optimization, Stencil, High-Order, Multigrid, Mehrstellen
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要