An introspection monitoring library to improve MPI communication time

Emmanuel Jeannot, Richard Sartori

Journal of Supercomputing (2023)

Abstract
In this paper, we describe how to reduce the communication time of MPI parallel applications using a library that monitors MPI applications and supports introspection (the program itself can query the state of the monitoring system). Building on previous work, the library can observe how collective communications are decomposed into point-to-point messages. It also provides monitoring sessions, which allow monitoring to be suspended and restarted so that it covers only specific portions of the code. Experiments show that the monitoring overhead is very small and that the proposed features enable dynamic and efficient rank reordering, yielding up to a twofold reduction in the communication time of some programs.
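To make the ideas of monitoring sessions and introspection concrete, here is a minimal sketch in plain Python. It does not use MPI and does not reproduce the library's actual API: the class `MonitoringSession`, its methods, and the helper `broadcast_binomial` are hypothetical names chosen for illustration. The binomial-tree decomposition of the broadcast is an assumption (one common way to implement a broadcast), not a claim about what the paper's library observes.

```python
class MonitoringSession:
    """Hypothetical toy monitor: counts bytes sent between rank pairs."""

    def __init__(self, n_ranks):
        self.n_ranks = n_ranks
        self.active = False  # sessions start suspended
        self.bytes_sent = [[0] * n_ranks for _ in range(n_ranks)]

    def start(self):
        self.active = True

    def suspend(self):
        self.active = False

    def record(self, src, dst, nbytes):
        # Messages are counted only while the session is active.
        if self.active:
            self.bytes_sent[src][dst] += nbytes

    def query(self):
        # Introspection: the program itself reads the monitoring state.
        return [row[:] for row in self.bytes_sent]


def broadcast_binomial(session, root, n_ranks, nbytes):
    """Decompose a broadcast into point-to-point messages (assumed binomial tree)."""
    mask = 1
    while mask < n_ranks:
        for rel in range(mask):
            peer = rel + mask
            if peer < n_ranks:
                src = (rel + root) % n_ranks
                dst = (peer + root) % n_ranks
                session.record(src, dst, nbytes)
        mask <<= 1


session = MonitoringSession(4)
broadcast_binomial(session, 0, 4, 100)  # not counted: session is suspended
session.start()
broadcast_binomial(session, 0, 4, 100)  # counted as three point-to-point messages
session.suspend()
session.record(2, 3, 999)               # not counted: session is suspended again
matrix = session.query()
```

The resulting `matrix` is the kind of per-pair communication volume that a rank-reordering step could take as input; the reordering itself is not shown here.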
Keywords
MPI,Monitoring,Communication optimization,HPC