Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems

EuroMPI'10: Proceedings of the 17th European MPI Users' Group Meeting Conference on Recent Advances in the Message Passing Interface (2010)

Abstract
With the ever-increasing number of cores per node on HPC systems, applications are increasingly using threads to exploit the shared memory within a node, combined with MPI across nodes. Achieving high performance when a large number of concurrent threads make MPI calls is a challenging task for an MPI implementation. We describe the design and implementation of our solution in MPICH2 to achieve high-performance multithreaded communication on the IBM Blue Gene/P. We use a combination of a multichannel-enabled network interface, fine-grained locks, lock-free atomic operations, and specially designed queues to provide a high degree of concurrent access while still maintaining MPI's message-ordering semantics. We present performance results that demonstrate that our new design improves the multithreaded message rate by a factor of 3.6 compared with the existing implementation on the BG/P. Our solutions are also applicable to other high-end systems that have parallel network access capabilities.
Keywords
Critical Section, Direct Memory Access, Multiple Thread, Incoming Message, Message Queue
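
The abstract names per-channel queues and fine-grained locks as the means of letting many threads communicate concurrently without breaking message ordering. The following is a minimal illustrative sketch of that general idea only, not the MPICH2 or Blue Gene/P implementation: the class `MultiChannelQueue`, its methods, and the mapping in `channel_for` are all hypothetical names introduced here. It assumes that sending every message of one (source, destination) pair through the same FIFO channel is enough to keep that pair's messages in order, while threads that map to different channels never contend for the same lock.

```python
import threading
from collections import deque


class MultiChannelQueue:
    """Hypothetical stand-in for a multichannel network interface.

    Each channel has its own FIFO queue and its own lock, so threads that
    map to different channels do not serialize on one global lock.
    """

    def __init__(self, num_channels):
        self.locks = [threading.Lock() for _ in range(num_channels)]
        self.queues = [deque() for _ in range(num_channels)]

    def channel_for(self, source, dest):
        # Assumed mapping: a fixed function of (source, dest), so all
        # messages of one pair share one FIFO and cannot overtake each other.
        return (source * 31 + dest) % len(self.queues)

    def post(self, source, dest, payload):
        ch = self.channel_for(source, dest)
        with self.locks[ch]:  # fine-grained: only this channel is locked
            self.queues[ch].append((source, dest, payload))

    def drain(self, ch):
        # Remove and return everything queued on one channel, in FIFO order.
        with self.locks[ch]:
            items = list(self.queues[ch])
            self.queues[ch].clear()
        return items


def run_demo(num_threads=4, num_channels=4, messages_per_thread=1000):
    mcq = MultiChannelQueue(num_channels)

    def sender(source):
        # Each thread posts an increasing sequence to destination 0.
        for seq in range(messages_per_thread):
            mcq.post(source, 0, seq)

    threads = [threading.Thread(target=sender, args=(s,))
               for s in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Collect what arrived, grouped by sending thread.
    received = {s: [] for s in range(num_threads)}
    for ch in range(num_channels):
        for source, _dest, payload in mcq.drain(ch):
            received[source].append(payload)
    return received
```

Calling `run_demo()` returns, for each sending thread, the payloads in the order they were queued; under the assumption above, each list matches the order in which that thread posted. Because of Python's global interpreter lock, the sketch illustrates the locking structure rather than any performance gain.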