Improving the Efficiency of GPGPU Work-Queue Through Data Awareness.

TACO(2017)

Abstract
The architecture and programming model of current GPGPUs are best suited to applications dominated by structured control and data flows across large, regular datasets. Parallel workloads with irregular control and data structures cannot easily harness the processing power of the GPGPU. One approach to mapping these irregular-parallel workloads to GPGPUs is to use work-queues. The work-queue approach improves the utilization of SIMD units by processing only useful work items that are dynamically generated during execution. Because current GPGPUs lack the necessary support for work-queues, a software-based work-queue implementation often suffers from memory contention and load-balancing issues. In this article, we present a novel hardware work-queue design named DaQueue, which incorporates three data-aware features to improve the efficiency of work-queues on GPGPUs. We evaluate our proposal on irregular-parallel workloads and carry out a case study on a path tracing pipeline with a cycle-level simulator. Experimental results show that, for the tested workloads, DaQueue improves performance by 1.53× on average and by up to 1.91×. Compared with the state-of-the-art hardware worklist approach, DaQueue achieves an average of 33.92% additional speedup at a lower hardware area cost.
Keywords
GPGPU, data awareness, work-queue