Software combining to mitigate multithreaded MPI contention

Abdelhalim Amer, Charles Archer, Michael Blocksome, Chongxiao Cao, Michael Chuvelev, Hajime Fujita, Maria Garzaran, Yanfei Guo, Jeff R. Hammond, Shintaro Iwasaki, Kenneth J. Raffenetti, Mikhail Shiryaev, Min Si, Kenjiro Taura, Sagar Thapaliya, Pavan Balaji

Proceedings of the ACM International Conference on Supercomputing (2019)

Abstract

Efforts to mitigate lock contention from concurrent threaded accesses to MPI have reduced contention through fine-grained locking, avoided locking altogether by offloading communication to dedicated threads, or alleviated negative side effects from contention by using better lock management protocols. The blocking nature of lock-based methods, however, wastes the asynchrony benefits of nonblocking MPI operations, and the offloading model sacrifices CPU resources and incurs unnecessary software offloading overheads under low contention. We propose new thread safety models, CSync and LockQ, based on software combining, a form of software offloading without the requirement for dedicated threads; a thread holding the lock combines work of threads that failed their lock acquisitions. We demonstrate that CSync, a direct application of software combining, improves scalability but suffers from lack of asynchrony and incurs unnecessary offloading. LockQ alleviates these shortcomings by leveraging MPI semantics to relax synchronization and reduce offloading requirements. We present the implementation, analysis, and evaluation of these models on a modern network fabric and show that LockQ outperforms most existing thread safety models in low- and high-contention regimes.
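The combining idea described above, in which a thread holding the lock executes work on behalf of threads that failed their lock acquisitions, can be illustrated with a minimal sketch. This is a generic, synchronous form of software combining and not the authors' CSync or LockQ implementation; the names `CombiningExecutor`, `submit`, and `run_demo` are hypothetical, and the sketch assumes submitted functions do not raise exceptions.

```python
import threading
from collections import deque


class _Request:
    """A unit of work published by a thread (hypothetical helper)."""

    def __init__(self, fn):
        self.fn = fn
        self.result = None
        self.done = threading.Event()


class CombiningExecutor:
    """Illustrative software-combining wrapper around a critical section."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()  # append/popleft are atomic in CPython

    def submit(self, fn):
        """Publish fn, then either become the combiner or wait for one."""
        req = _Request(fn)
        self._pending.append(req)
        while not req.done.is_set():
            if self._lock.acquire(blocking=False):
                # Lock acquired: this thread is the combiner and executes
                # every pending request, including those of other threads.
                try:
                    self._drain()
                finally:
                    self._lock.release()
            else:
                # Lock acquisition failed: the current holder may run our
                # request. The timeout covers the case where the holder
                # finished draining just before our request was published.
                req.done.wait(timeout=0.001)
        return req.result

    def _drain(self):
        # Only the lock holder calls this, so requests run one at a time.
        while True:
            try:
                req = self._pending.popleft()
            except IndexError:
                return
            req.result = req.fn()  # assumption: fn does not raise
            req.done.set()


def run_demo(num_threads=4, ops_per_thread=200):
    """Increment a shared counter through the executor from several threads."""
    executor = CombiningExecutor()
    state = {"counter": 0}

    def increment():
        # Unsynchronized on its own; safe here because it only ever
        # runs inside the combiner's critical section.
        state["counter"] += 1
        return state["counter"]

    def worker():
        for _ in range(ops_per_thread):
            executor.submit(increment)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]
```

Because each request is executed exactly once by whichever thread holds the lock, `run_demo(4, 200)` returns 800. Note that every caller still waits for its own request to complete, which is the lack of asynchrony the abstract attributes to a direct application of combining.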
