DRAM Bandwidth and Latency Stacks: Visualizing DRAM Bottlenecks

2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2022

Abstract
For memory-bound applications, memory bandwidth utilization and memory access latency determine performance. DRAM specifications state the maximum peak bandwidth and the uncontended read latency, but these numbers are never achieved in practice. Many factors affect the bandwidth actually achieved, and it is often not obvious to hardware architects or software developers how higher bandwidth usage, and thus higher performance, can be reached. Similarly, latency is affected by numerous technology constraints and by queueing in the memory controller.

DRAM bandwidth stacks intuitively visualize the memory bandwidth consumption of an application and indicate where potential bandwidth is lost. The top of the stack is the peak bandwidth, while the bottom component shows the bandwidth actually achieved. The other components show how much bandwidth is wasted on DRAM refresh, precharge, and activate commands, or because (parts of) the DRAM chip are idle when no memory operations are available. DRAM latency stacks show the average latency of a memory read operation, divided into base read time, row-conflict, and several queueing components. DRAM bandwidth and latency stacks are complementary to CPI stacks and speedup stacks, providing additional insight for optimizing the performance of an application or improving the hardware.
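As a minimal sketch of the bandwidth-stack idea described above: DRAM cycles are attributed to categories (data transfer, activate/precharge overhead, refresh, idle), and each category's share of total cycles is scaled to the peak bandwidth, so all components sum to the peak and the data-transfer component equals the achieved bandwidth. The counter names and values below are invented for illustration, not the paper's actual methodology or measurements.

```python
def bandwidth_stack(counters, peak_gbps):
    """Scale each cycle category to its share of peak bandwidth (GB/s).

    counters: dict mapping category name -> DRAM cycles attributed to it
    peak_gbps: theoretical peak bandwidth from the DRAM specification
    """
    total = sum(counters.values())
    return {name: peak_gbps * cycles / total for name, cycles in counters.items()}

# Hypothetical cycle counters for a memory-bound phase (values invented).
counters = {
    "data_transfer": 600,       # bus moves useful data -> achieved bandwidth
    "activate_precharge": 150,  # row activate/precharge command overhead
    "refresh": 50,              # periodic refresh blocks the bank
    "idle": 200,                # no requests available at the memory controller
}

stack = bandwidth_stack(counters, peak_gbps=25.6)
# Components sum to the 25.6 GB/s peak; "data_transfer" is the achieved share.
```

The latency-stack decomposition is analogous: average read latency is split into base read time, row-conflict, and queueing components that sum to the measured average.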
Keywords
memory-bound applications, memory bandwidth utilization, DRAM specifications, maximum peak bandwidth, hardware architects, software developers, higher bandwidth usage, memory controller, DRAM bandwidth stacks, memory bandwidth consumption, activate commands, DRAM chip, memory operations, DRAM latency stacks, average latency, CPI stacks, speedup stacks, DRAM bottlenecks