Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model

Parallel Processing Letters (2014)

Abstract
Our aim is to apply program transformations to stencil codes in order to yield the highest possible performance. We recognize memory bandwidth as a major limitation in stencil code performance. We conducted a study in which we applied optimizing transformations to two Jacobi smoother kernels: one 3D 1st-order 7-point stencil and one 3D 3rd-order 19-point stencil. To obtain high performance, the optimizations have to be customized for the execution platform at hand. We illustrate this by experiments on two consumer and two server architectures. We also verified the need for complex optimizations with the help of the Execution-Cache-Memory performance model. A code generator with knowledge about stencil codes and execution platforms should be able to apply our transformations automatically. We are working towards such a generator in project ExaStencils.
Keywords
Jacobi smoothers, high-performance computing, program transformations, stencil codes, ECM performance model
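
For readers unfamiliar with the kernels named in the abstract, the following is a minimal, unoptimized sketch of one damped Jacobi sweep with a 3D 7-point stencil. It is not the authors' code: the function name `jacobi_7pt_sweep`, the flat array layout, the damping factor `omega`, and the choice of a zero right-hand side (Laplace equation) are all assumptions made for illustration. It shows the baseline loop nest that optimizing transformations would start from, where each interior update reads seven values and writes one.

```python
def jacobi_7pt_sweep(src, n, omega=0.8):
    """One damped Jacobi sweep of a 7-point stencil on an n*n*n grid.

    Assumptions (illustrative only): the grid is stored as a flat list in
    row-major order, the right-hand side is zero, and boundary values are
    held fixed. `omega` is a hypothetical damping factor.
    """
    def idx(i, j, k):
        # Row-major mapping of a 3D index to the flat list.
        return (i * n + j) * n + k

    # Jacobi reads only old values, so the result goes to a separate array.
    dst = list(src)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                neighbours = (
                    src[idx(i - 1, j, k)] + src[idx(i + 1, j, k)]
                    + src[idx(i, j - 1, k)] + src[idx(i, j + 1, k)]
                    + src[idx(i, j, k - 1)] + src[idx(i, j, k + 1)]
                )
                # Centre point plus six face neighbours: seven points in total.
                dst[idx(i, j, k)] = (
                    (1.0 - omega) * src[idx(i, j, k)] + omega * neighbours / 6.0
                )
    return dst


if __name__ == "__main__":
    n = 4
    grid = [0.0] * (n * n * n)
    grid[0] = 1.0  # arbitrary boundary value for a quick manual check
    print(len(jacobi_7pt_sweep(grid, n)))
```

The sweep writes to a second array because Jacobi iteration uses only values from the previous iteration. That requirement is also why memory traffic, which the abstract identifies as the main limitation, shows up so directly in this kind of kernel.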