Exhaustive Evaluation Of Memory-Latency Sensitivity On Manycore Processors With Large Cache

2018 2ND INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPILATION, COMPUTING AND COMMUNICATIONS (HP3C 2018)(2018)

Abstract
The launch of DIMM-type 3D XPoint is planned for 2018, and machines that use such devices as large main memory will become commodity in the near future. It is important to evaluate application performance on those machine configurations beforehand, considering the effects of the larger main-memory latency. The objective of this paper is to propose an accurate, high-throughput evaluation methodology for exhaustive experiments that cover many applications under various multidimensional conditions. The target architecture is manycore processors such as Xeon Phi KNL, which are assumed to have a large DRAM cache in addition to 3D XPoint main memory. To evaluate the latency effects accurately, it is necessary to take the stall cycles caused by main-memory accesses into account. However, cycle-accurate simulators are too slow for this purpose. Instead, we use the performance counters of the processors. The current Xeon Phi KNL, however, does not have any performance counters for these stalls. To address this issue, our method integrates measurement results from Xeon Skylake-SP, which has the desired performance counters and a memory system close to that of KNL. The paper shows results of exhaustive experiments, which take two days with the proposed method for arbitrary latency settings. With a cycle-accurate simulator, the equivalent experiments would take about 180 years per latency setting.
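The abstract does not state the model that turns counter readings into performance estimates. As a rough illustration only, the sketch below assumes a simple first-order model in which main-memory stall cycles scale linearly with memory latency while all other cycles stay constant. The function names, parameters, and the linear-scaling assumption are hypothetical and are not taken from the paper.

```python
def estimate_cycles(total_cycles: float,
                    mem_stall_cycles: float,
                    base_latency_ns: float,
                    target_latency_ns: float) -> float:
    """Estimate execution cycles under a different main-memory latency.

    Assumption (not from the paper): stall cycles caused by main-memory
    accesses grow in proportion to the latency, and the remaining cycles
    are unaffected.
    """
    if base_latency_ns <= 0 or target_latency_ns <= 0:
        raise ValueError("latencies must be positive")
    if not 0 <= mem_stall_cycles <= total_cycles:
        raise ValueError("stall cycles must lie within total cycles")

    scale = target_latency_ns / base_latency_ns
    non_stall_cycles = total_cycles - mem_stall_cycles
    return non_stall_cycles + mem_stall_cycles * scale


def estimate_slowdown(total_cycles: float,
                      mem_stall_cycles: float,
                      base_latency_ns: float,
                      target_latency_ns: float) -> float:
    """Return the estimated slowdown factor relative to the measured run."""
    estimated = estimate_cycles(total_cycles, mem_stall_cycles,
                                base_latency_ns, target_latency_ns)
    return estimated / total_cycles
```

Because such a model is a closed-form function of already-measured counter values, evaluating it for a new latency setting needs no new run, which is one plausible reading of how arbitrary latency settings can be covered cheaply.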
Keywords
Performance evaluation, Benchmarking, Nonvolatile memory, 3D XPoint, DRAM cache memory, Manycore, Multithread