Scalable Tracing of MPI Events and Performance Metrics

IPDPS Workshops (2023)

Abstract
Tracing is a basic approach to analyzing performance and understanding the behavior patterns of MPI programs. However, MPI event traces require increasingly large storage space as the parallel scale grows. Besides MPI event traces, many performance analysis tasks (e.g., performance variance detection, proxy synthesis) also require detailed runtime performance metrics, which further aggravates the storage problem. In this paper, we propose a scalable tracing tool that effectively records and compresses MPI event traces and the related runtime performance metrics. The tool analyzes the data redundancy caused by loops and by the SPMD (single program, multiple data) property of MPI programs. Based on this analysis, the tool compactly reorganizes and stores the data. Compared with existing trace compression methods, our tool generally achieves a higher compression ratio at a lower time cost.
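
The two sources of redundancy named in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm, which the abstract does not describe in detail. It is a minimal illustration, assuming a trace is a plain list of MPI call names per rank. The helpers `fold_repeats` and `merge_ranks` and the `max_period` parameter are hypothetical names introduced here.

```python
from collections import defaultdict


def fold_repeats(events, max_period=8):
    """Greedily fold consecutive repetitions of a block into (block, count) pairs.

    Illustrates loop-induced redundancy: a loop body that issues the same
    MPI calls every iteration collapses to one block plus a repeat count.
    """
    out = []
    i = 0
    n = len(events)
    while i < n:
        best_period, best_count = 1, 1
        for period in range(1, max_period + 1):
            block = events[i:i + period]
            if len(block) < period:
                break  # not enough events left for a block of this size
            count = 1
            while events[i + count * period:i + (count + 1) * period] == block:
                count += 1
            # Keep the fold that covers the most events.
            if count > 1 and count * period > best_period * best_count:
                best_period, best_count = period, count
        out.append((tuple(events[i:i + best_period]), best_count))
        i += best_period * best_count
    return out


def merge_ranks(traces):
    """Group ranks whose folded traces are identical.

    Illustrates SPMD redundancy: ranks running the same code path produce
    the same folded trace, so it only needs to be stored once.
    """
    groups = defaultdict(list)
    for rank, events in sorted(traces.items()):
        groups[tuple(fold_repeats(events))].append(rank)
    return {tuple(ranks): list(folded) for folded, ranks in groups.items()}


if __name__ == "__main__":
    loop_trace = ["MPI_Init"] + ["MPI_Send", "MPI_Recv"] * 3 + ["MPI_Finalize"]
    traces = {0: loop_trace, 1: list(loop_trace), 2: ["MPI_Init", "MPI_Finalize"]}
    for ranks, folded in merge_ranks(traces).items():
        print(ranks, folded)
```

In this sketch, ranks 0 and 1 share one stored entry in which the send/receive pair appears once with a repeat count of 3, while rank 2 keeps its own entry. A real tool would also have to handle call parameters, timing, and the performance metrics attached to each event, all of which this sketch ignores.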
Keywords
High performance computing, MPI, trace, compression, performance analysis