Lessons Learned on MPI+Threads Communication

SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (2022)

Abstract
Hybrid MPI+threads programming is gaining prominence, but, in practice, applications often perform more slowly with it than with the MPI everywhere model. The most critical challenge to the parallel efficiency of MPI+threads applications is slow MPI_THREAD_MULTIPLE performance. MPI libraries have recently made significant strides on this front, but to exploit their capabilities, users must expose the communication parallelism in their MPI+threads applications. Recent studies show that MPI 4.0 provides users with new performance-oriented options to do so, but our evaluation of these new mechanisms shows that they pose several challenges. An alternative design is MPI Endpoints. In this paper, we present a comparison of the different designs from the perspective of MPI's end-users: domain scientists and application developers. We evaluate the mechanisms on metrics beyond performance such as usability, scope, and portability. Based on the lessons learned, we make a case for a future direction.
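For context, the sketch below (not taken from the paper) illustrates one common way to expose communication parallelism under MPI_THREAD_MULTIPLE: each OpenMP thread communicates over its own duplicated communicator, so the MPI library can treat the per-thread traffic streams as independent. The ring exchange pattern and the assumption that all ranks spawn the same number of threads are illustrative choices, not the paper's method.

```c
/* Minimal sketch (not from the paper): request MPI_THREAD_MULTIPLE and give
 * each OpenMP thread its own duplicated communicator, one common way to
 * expose per-thread communication independence to the MPI library.
 * Assumes every rank spawns the same number of threads. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE is not supported\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Duplicate communicators on the main thread: MPI_Comm_dup is collective,
     * so it must not be called concurrently on the same communicator. */
    int nthreads = omp_get_max_threads();
    MPI_Comm *comms = malloc(nthreads * sizeof(MPI_Comm));
    for (int t = 0; t < nthreads; t++)
        MPI_Comm_dup(MPI_COMM_WORLD, &comms[t]);

    #pragma omp parallel
    {
        /* Each thread exchanges data around a ring with the matching thread
         * on neighboring ranks, over its own communicator, so the traffic
         * streams are independent from the MPI library's point of view. */
        int t = omp_get_thread_num();
        int right = (rank + 1) % size;
        int left  = (rank - 1 + size) % size;
        int val = rank * nthreads + t, recvd;
        MPI_Sendrecv(&val, 1, MPI_INT, right, 0,
                     &recvd, 1, MPI_INT, left, 0,
                     comms[t], MPI_STATUS_IGNORE);
    }

    for (int t = 0; t < nthreads; t++)
        MPI_Comm_free(&comms[t]);
    free(comms);
    MPI_Finalize();
    return 0;
}
```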
Keywords
Exascale computing,Message-oriented middleware,MPI Endpoints,MPI+OpenMP,MPI+threads,MPI_THREAD_MULTIPLE,Partitioned communication