SelSMaP: A Selective Stride Masking Prefetching Scheme.

ICCD(2019)

引用 5|浏览74
暂无评分
摘要
Data prefetching, which intelligently loads data closer to the processor before demands, is a popular cache performance optimization technique to address the increasing processor-memory performance gap. Although prefetching concepts have been proposed for decades, sophisticated system architecture and emerging applications introduce new challenges. Large instruction windows coupled with out-of-order execution makes the program data access sequence distorted from a cache perspective. Furthermore, big data applications stress memory subsystems heavily with their large working set sizes and complex data access patterns. To address such challenges, this work proposes a high-performance hardware prefetching scheme, SelSMaP. SelSMaP is able to detect both regular and nonuniform stride patterns by taking the minimum observed address offset (called a reference stride) as a heuristic. A stride masking is generated according to the reference stride and is to filter out history accesses whose pattern can be rephrased as uniform stride accesses. Prefetching decision and prefetch degree are determined based on the masking outcome. As SelSMaP prediction logic does not rely on the chronological order of data accesses or program counter information, it is able to unveil the effect of out-of-order execution and compiler optimization. We evaluated SelSMaP with CloudSuite workloads and SPEC CPU2006 benchmarks. SelSMaP achieves an average CloudSuite performance improvement of 30% over nonprefetching systems. With one to two orders of magnitude less storage and much less functional logic, SelSMaP outperforms the highest-performing prefetcher by 8.6% in CloudSuite workloads.
更多
查看译文
关键词
Prefetching, cache, memory subsystem
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要