On the Transformation Optimization for Stencil Computation

Electronics (2022)

Abstract
Stencil computation optimizations have been studied extensively, and various approaches have been proposed. Loop transformation is a vital class of optimization in modern production compilers, where it has been employed successfully. In this paper, we combine the two aspects to study the potential benefits that some common transformation recipes may have for stencils. The recipes consist of loop unrolling, loop fusion, address precalculation, redundancy elimination, instruction reordering, load balancing, and a forward and backward update algorithm named semi-stencil. Experimental evaluations of diverse stencil kernels, including 1D, 2D, and 3D computation patterns, on two typical ARM and Intel platforms demonstrate the respective effects of the transformation recipes. For the single transformation recipes we analyze, an average speedup of 1.65x is obtained, and the best is 1.88x. The compound recipes demonstrate a maximum speedup of 1.92x.
Keywords
stencil computation, loop transformation, loop fusion, loop unroll, performance optimization, HPC
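To make one of the recipes named in the abstract concrete, the following is a minimal sketch of loop unrolling applied to a 3-point 1D stencil. It is not taken from the paper: the function names (`stencil1d_baseline`, `stencil1d_unrolled4`, `stencil1d_results_match`), the symmetric coefficients `c0` and `c1`, and the unroll factor of 4 are all assumptions chosen for illustration. Whether unrolling by hand actually helps depends on the compiler and platform.

```c
#include <stddef.h>

/* Baseline 3-point 1D stencil (hypothetical kernel, not the paper's code):
 * out[i] = c0*in[i-1] + c1*in[i] + c0*in[i+1] for interior points only. */
void stencil1d_baseline(const double *in, double *out, size_t n,
                        double c0, double c1)
{
    for (size_t i = 1; i + 1 < n; ++i)
        out[i] = c0 * in[i - 1] + c1 * in[i] + c0 * in[i + 1];
}

/* Same kernel with the loop body unrolled by 4 (factor is an assumption).
 * A remainder loop handles interior points left over after the unrolled part. */
void stencil1d_unrolled4(const double *in, double *out, size_t n,
                         double c0, double c1)
{
    if (n < 3)
        return;                      /* no interior points to update */

    const size_t end = n - 1;        /* one past the last interior index */
    size_t i = 1;

    /* Unrolled part: four independent updates per iteration. */
    for (; i + 4 <= end; i += 4) {
        out[i]     = c0 * in[i - 1] + c1 * in[i]     + c0 * in[i + 1];
        out[i + 1] = c0 * in[i]     + c1 * in[i + 1] + c0 * in[i + 2];
        out[i + 2] = c0 * in[i + 1] + c1 * in[i + 2] + c0 * in[i + 3];
        out[i + 3] = c0 * in[i + 2] + c1 * in[i + 3] + c0 * in[i + 4];
    }

    /* Remainder: at most three interior points. */
    for (; i < end; ++i)
        out[i] = c0 * in[i - 1] + c1 * in[i] + c0 * in[i + 1];
}

/* Helper for checking: returns 1 if both versions agree on an n-point
 * input (n <= 64), 0 otherwise. */
int stencil1d_results_match(size_t n)
{
    double in[64], a[64], b[64];
    if (n > 64)
        return 0;
    for (size_t i = 0; i < n; ++i) {
        in[i] = (double)((i * i) % 7) + 0.5 * (double)i;
        a[i] = 0.0;
        b[i] = 0.0;
    }
    stencil1d_baseline(in, a, n, 0.25, 0.5);
    stencil1d_unrolled4(in, b, n, 0.25, 0.5);
    for (size_t i = 0; i < n; ++i) {
        double d = a[i] - b[i];
        if (d < 0.0)
            d = -d;
        if (d > 1e-12)
            return 0;
    }
    return 1;
}
```

Both versions leave the boundary elements `out[0]` and `out[n-1]` untouched, so they can be swapped for one another and timed on the same input. The unrolled body also exposes neighboring loads (`in[i]`, `in[i + 1]`, and so on) that a compiler, or a further manual redundancy-elimination step, may keep in registers.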