Improve GPGPU Latency Hiding with a Hybrid Recovery Stack and a Window Based Warp Scheduling Policy

High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (2012)

Abstract
Branch divergence usually has a serious impact on the efficiency of a SIMD pipeline. The branch method of Dynamic Warp Subdivision (DWS), however, exploits branch divergence to hide memory latency by interleaving instruction issue among all branch paths of a warp. This method can suffer from a serious over-subdivision problem. We therefore propose a hybrid stack mechanism that lets the PDOM stack issue any ready sub-warp without losing the stack's logical structure. To maximize the hybrid stack's potential, we also propose a window-based scheduling policy that reinforces memory latency hiding. Experimental results show that, on our 7 selected benchmark programs, combining the window-based scheduling policy with the hybrid stack hardware improves performance by 10% over the baseline configuration using PDOM with loose round-robin scheduling, and by 6.8% over DWS-PC with our window-based scheduling policy.
Keywords
hybrid recovery stack,pdom loose round-robin method,memory latency,warp scheduling policy,branch method,serious impact,memory latency hiding,branch path,serious over-subdivision problem,branch divergence phenomenon,interleaving issue,dynamic warp subdivision,improve gpgpu latency hiding,scheduling,benchmark testing,switches,pipelines,architecture,logical structure,dynamic scheduling,performance,simd,baseline configuration,gpgpu,hardware