Common Subexpression Convergence: A New Code Optimization for SIMT Processors

Languages and Compilers for Parallel Computing, LCPC 2019 (2021)

Abstract
On SIMT processors, when the threads of a warp diverge at a branch, the hardware scheduler serializes execution of the divergent paths, which reduces SIMT efficiency. We propose a new compiler optimization, Common Subexpression Convergence (CSC), that uses cross-block scheduling to move expression trees common to the diverged paths into convergent regions, where they are executed by more or all of the warp's threads in parallel, improving both SIMT efficiency and execution time. Our optimization framework is based on a dynamic programming algorithm for finding maximally profitable common expression subgraphs. We also introduce a general approach, based on the program dependence graph, for testing the legality of the optimization, and a heuristic-based cost model for deciding when it should be applied. We demonstrate the potential benefits of our approach through a preliminary hand-optimized evaluation using synthetic benchmarks and a BitonicSort example program.
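To make the efficiency argument concrete, the following is a minimal sketch of a toy issue-slot model, not the paper's cost model or algorithm. It assumes a 32-thread warp, counts one issue step per instruction, and treats each diverged path as fully serialized. The function names (`simt_efficiency`, `diverged_schedule`, `csc_schedule`) and the operation counts are hypothetical and chosen only for illustration.

```python
WARP_SIZE = 32  # assumed warp width; typical for NVIDIA GPUs


def _issue(active_lanes, num_ops):
    """One entry per issued instruction; a path with no active lanes is skipped."""
    return [active_lanes] * num_ops if active_lanes > 0 else []


def simt_efficiency(steps):
    """Average fraction of warp lanes doing useful work per issued step."""
    return sum(steps) / (WARP_SIZE * len(steps))


def diverged_schedule(n_then, common_ops, then_ops, else_ops):
    """Baseline: each diverged path evaluates the common expression itself."""
    n_else = WARP_SIZE - n_then
    return (_issue(n_then, common_ops + then_ops)
            + _issue(n_else, common_ops + else_ops))


def csc_schedule(n_then, common_ops, then_ops, else_ops):
    """After hoisting: the common expression runs once in a convergent region."""
    n_else = WARP_SIZE - n_then
    return (_issue(WARP_SIZE, common_ops)
            + _issue(n_then, then_ops)
            + _issue(n_else, else_ops))


if __name__ == "__main__":
    # Hypothetical workload: half the warp takes each path,
    # 6 shared operations, 2 path-specific operations per side.
    before = diverged_schedule(16, common_ops=6, then_ops=2, else_ops=2)
    after = csc_schedule(16, common_ops=6, then_ops=2, else_ops=2)
    print(len(before), simt_efficiency(before))
    print(len(after), simt_efficiency(after))
```

Under these assumptions, both schedules perform the same amount of useful per-lane work, but the hoisted version issues fewer steps because the shared operations run once with every lane active. The model ignores legality (the program dependence graph check) and any extra register pressure or data movement that hoisting may introduce, which is what a real cost model would need to weigh.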
Keywords
SIMT processors, Thread divergence, Compiler optimizations, GPUs