COOL: A Cloud-Optimized Structure for MPI Collective Operations

Proceedings of the 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), 2018

Abstract
We present COOL, a simple and generic structure for MPI collective operations. COOL enables highly efficient designs for all collective operations in the cloud. We then present a system design based on COOL that implements frequently used collective operations. Our design uses the intra-rack network efficiently while minimizing cross-rack communication, thus improving application performance and scalability. We use recent software-defined networking capabilities to build optimal network paths for I/O-intensive collective operations. Our analytical evaluation shows that our design imposes the least possible network overhead across racks. Furthermore, compared with OpenMPI and MPICH, our design reduces the number of steps to only three, decreases the number of exchanged messages by a factor of N (the total number of processes), and reduces the network load by up to an order of magnitude. These significant improvements come at the cost of a modest increase in the computation load on a few processes.
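To make the idea of keeping traffic inside racks concrete, the following is a minimal simulation of a generic rack-aware, three-step sum-allreduce. It is only an illustrative sketch: the abstract does not spell out COOL's three steps, so the step breakdown (reduce to a rack leader, exchange among leaders, distribute within the rack), the function name `hierarchical_allreduce`, and the `rack_of` mapping are all assumptions, not the authors' design.

```python
from collections import defaultdict


def hierarchical_allreduce(values, rack_of):
    """Simulate a rack-aware three-step sum-allreduce (hypothetical sketch).

    values:  values[p] is the contribution of process p
    rack_of: rack_of[p] is the assumed rack id of process p
    Returns (results, intra_rack_msgs, cross_rack_msgs).
    """
    # Group process ids by rack; one process per rack acts as leader.
    racks = defaultdict(list)
    for p, r in enumerate(rack_of):
        racks[r].append(p)

    intra = 0
    cross = 0

    # Step 1: every non-leader sends its value to its rack leader,
    # which computes the rack's partial sum.
    partial = {}
    for r, members in racks.items():
        partial[r] = sum(values[p] for p in members)
        intra += len(members) - 1

    # Step 2: leaders exchange partial sums with each other.
    # This is the only step that crosses racks.
    total = sum(partial.values())
    cross += len(racks) * (len(racks) - 1)

    # Step 3: each leader sends the final result to the rest of its rack.
    results = [None] * len(values)
    for r, members in racks.items():
        for p in members:
            results[p] = total
        intra += len(members) - 1

    return results, intra, cross
```

For 8 processes split evenly over 2 racks, this sketch sends 2 cross-rack messages, whereas a naive flat exchange in which every process sends its value to every other process would send 8 × 4 = 32 messages across racks. The extra summation work lands on the rack leaders, which mirrors the trade-off the abstract mentions of a higher computation load on a few processes.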
Keywords
MPI, collective operations, communication patterns, cloud, software-defined networking