MPI Meets Cloud: Case Study with Amazon EC2 and Microsoft Azure

Shulei Xu, Seyedeh Mahdieh Ghazimirsaeed, Jahanzeb Maqbool Hashmi, Hari Subramoni, Dhabaleswar K. Panda

2020 IEEE/ACM Fourth Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM) (2020)

Abstract

Scientists have traditionally employed High Performance Computing (HPC) systems to scale parallel scientific applications. Over the last decade, cloud-based systems have sought to match the performance of bare-metal hardware while keeping the benefits of the computing-as-a-service model intact. Today's HPC cloud offerings, such as Microsoft Azure H-series and Amazon EC2, provide near-native performance, high customizability, resource provisioning and isolation, and elasticity to dynamically scale applications. MPI is the de facto programming model used to scale these scientific applications onto large-scale supercomputers. While several studies have examined the performance of MPI libraries and applications on various HPC systems, an understanding of their characteristics on state-of-the-art HPC clouds is still lacking in the literature. In this paper, we provide a systematic study of HPC applications and benchmarks on two popular HPC clouds. The overarching goal is to understand the performance characteristics of HPC clouds and to study the impact of customizability on the usage of MPI. We conduct a case study of performance optimization on two popular HPC clouds and obtain up to 150% improvement in point-to-point communication bandwidth. We also compare the performance of several different MPI libraries for various micro-benchmarks and scientific applications (WRF) on two HPC clouds and one native supercomputer (Frontera). Based on the insights gained from this study, we provide a set of guidelines for MPI application developers to scale their applications on clouds and achieve the best performance.

Keywords

HPC Clouds, MPI, Amazon EC2, Microsoft Azure
