ID-cache: instruction and memory divergence based cache management for GPUs

2016 IEEE International Symposium on Workload Characterization (IISWC), 2016

Cited by 8 | Views 103
Abstract
Modern graphics processing units (GPUs) are able not only to perform graphics rendering but also general-purpose parallel computations (GPGPU). It has been shown that the GPU L1 data cache and the on-chip interconnect bandwidth are important sources of performance bottlenecks and inefficiencies in GPGPUs. In this work, we aim to understand the sources of these inefficiencies and the opportunities for more efficient cache and interconnect bandwidth management on GPUs. We do so by studying the predictability of the reuse behavior and spatial utilization of cache lines using program-level information, such as the instruction PC, and runtime behavior, such as the extent of memory divergence. Through our characterization results, we demonstrate that a) the PC and memory divergence can be used to efficiently bypass zero-reuse cache lines from the cache; and b) memory divergence information can further be used to dynamically insert cache lines at varying size granularities based on their spatial utilization. Finally, based on the insights derived from our characterization, we design a simple Instruction and memory Divergence (ID) cache management method that achieves an average performance improvement of 71% for a wide variety of cache- and interconnect-sensitive applications.
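The bypass idea described in the abstract — using the load instruction's PC together with the degree of memory divergence to predict zero-reuse cache lines — can be sketched roughly as follows. This is an illustrative model only, not the paper's actual hardware mechanism; the class name, counter widths, and divergence threshold are all assumptions chosen for clarity.

```python
class BypassPredictor:
    """Illustrative PC-indexed bypass predictor (not the paper's design).

    Tracks, per load PC, a saturating counter of cache fills that were
    evicted without reuse. A fill is bypassed when its PC has a zero-reuse
    history and the access is memory-divergent (few lanes of the warp
    touch the same cache line, so spatial utilization is low).
    """

    def __init__(self, divergence_threshold=8, counter_max=3):
        self.zero_reuse = {}                 # PC -> saturating counter
        self.threshold = divergence_threshold
        self.counter_max = counter_max

    def should_bypass(self, pc, lanes_sharing_line):
        # Divergent access: fewer lanes share the line than the threshold.
        diverged = lanes_sharing_line < self.threshold
        # Bypass only when the PC repeatedly produced zero-reuse lines.
        return diverged and self.zero_reuse.get(pc, 0) >= 2

    def on_eviction(self, pc, was_reused):
        # Evictions without reuse push the counter toward bypassing;
        # reused lines push it back toward caching.
        c = self.zero_reuse.get(pc, 0)
        if was_reused:
            self.zero_reuse[pc] = max(c - 1, 0)
        else:
            self.zero_reuse[pc] = min(c + 1, self.counter_max)
```

Under this sketch, a well-coalesced load (many lanes sharing a line) is always cached, while a divergent load whose lines were repeatedly evicted unused is bypassed, saving both L1 capacity and interconnect bandwidth.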
Keywords
ID-cache, memory divergence, cache management, GPGPU, graphics processing units, graphics rendering, general-purpose parallel computations, GPU L1 data cache, on-chip interconnect bandwidth, performance bottlenecks, program-level information, instruction PC, runtime behavior, PC, zero-reuse cache lines, memory divergence information, size granularities, spatial utilization, interconnect-sensitive applications