Assessing the Memory Wall in Complex Codes

2022 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC), 2022

Abstract
Many of Los Alamos National Laboratory’s (LANL) High Performance Computing (HPC) codes are heavily memory-bandwidth bound. These codes often exhibit high levels of sparse memory access, with patterns that differ significantly from those of industry-standard benchmarks such as STREAM and GUPS. In this paper we present an analysis of some of our most important code bases and their memory access patterns. From this analysis we generate representative micro-benchmarks that preserve the memory access characteristics of our codes using two approaches: one based on statistical sampling of relative memory offsets in a sliding time window at the function level, and another at the loop level. The function-level approach is used to assess the impact of advanced memory technologies such as LPDDR5 and HBM3 using the gem5 [1] simulator. Our simulation results show significant improvements for sparse memory access workloads using HBM3 relative to LPDDR5, and better scaling on a per-core basis. An assessment of two different CPU architectures shows that significantly higher peak memory bandwidth results in high bandwidth on sparse workloads. These two assessments demonstrate the benefits of this workload characterization technique in memory system design and evaluation.
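To make the idea of sampling relative memory offsets in a sliding window more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the trace is a plain list of integer addresses and that the window is measured in number of accesses rather than time; the function names offset_histogram and synthesize, and all parameters, are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, deque


def offset_histogram(trace, window=4):
    """Count offsets between each address and the previous `window` addresses.

    Assumption: `trace` is an iterable of integer addresses, and the sliding
    window is counted in accesses (the paper describes a time window).
    """
    recent = deque(maxlen=window)
    hist = Counter()
    for addr in trace:
        for prev in recent:
            hist[addr - prev] += 1
        recent.append(addr)
    return hist


def synthesize(hist, length, window=4, start=0, seed=0):
    """Generate a synthetic address stream from an offset histogram.

    Each new address is a recently generated address plus an offset drawn
    with probability proportional to its count in `hist`. This is one simple
    way to reuse the statistics; addresses are not clamped to a valid range.
    """
    if length <= 0:
        return []
    rng = random.Random(seed)
    offsets = list(hist)
    weights = [hist[o] for o in offsets]
    recent = deque([start], maxlen=window)
    out = [start]
    while len(out) < length:
        base = rng.choice(list(recent))
        addr = base + rng.choices(offsets, weights=weights)[0]
        recent.append(addr)
        out.append(addr)
    return out
```

A sparse workload would typically yield a histogram with many distinct, widely spread offsets, whereas a STREAM-like unit-stride loop concentrates nearly all of its weight on a few small offsets. A real micro-benchmark generator would also need to bound addresses to an allocated region and preserve the read/write mix, both of which this sketch omits.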
Keywords
hardware architecture,memory technology,workload analysis