Improving resource usage in HPC clouds

V. Antonenko, A. Chupakhin, I. Petrov, R. Smeliansky

Semantic Scholar (2019)

Abstract
Many supercomputer users are dissatisfied with how long their jobs wait in the supercomputer queue. To shorten that queue, we suggest using cloud resources (HPC-as-a-service). Our main goal is to decrease the sum of wait time and execution time for supercomputer jobs. One of the key drawbacks of HPC clouds is low CPU usage caused by network communication overhead: instances of an HPC application may reside on different physical machines separated by significant network latency, so network communication may consume significant time and result in CPU stalls. In this paper we present and test the following hypothesis: “MPI programs that don’t require a lot of computing resources can effectively share the same set of resources”. This is possible when the cloud network is slow or when the MPI programs use network resources intensively but computational resources lightly. Such programs can run simultaneously without significant slowdown, because while one program waits to receive data over the network, the otherwise stalled CPU can execute another program. We tested our hypothesis on a popular set of MPI benchmarks, the NAS Parallel Benchmarks (NPB). The experiments show that we can improve CPU usage in the cloud with negligible degradation in the execution time of HPC applications.
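The intuition behind the hypothesis can be illustrated with a toy model. The following is a minimal sketch, not the authors' experimental setup: it assumes a single core, jobs that strictly alternate fixed-length compute bursts and network waits, and a fixed-priority scheduler (the lowest-index ready job runs). The names `Job` and `simulate` and all parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    compute: int     # CPU ticks per iteration (assumed >= 1)
    wait: int        # network-wait ticks per iteration (assumed >= 1)
    iterations: int  # number of compute+wait iterations (assumed >= 1)


def simulate(jobs):
    """Simulate jobs sharing one core; return (makespan, cpu_utilization)."""
    n = len(jobs)
    compute_left = [j.compute for j in jobs]
    wait_left = [0] * n
    iters_left = [j.iterations for j in jobs]
    done = [False] * n
    tick = 0
    busy = 0
    while not all(done):
        # Jobs that want the CPU at the start of this tick.
        ready = [i for i in range(n) if not done[i] and wait_left[i] == 0]
        # Network waits progress in parallel and need no CPU.
        for i in range(n):
            if not done[i] and wait_left[i] > 0:
                wait_left[i] -= 1
                if wait_left[i] == 0:
                    iters_left[i] -= 1
                    if iters_left[i] == 0:
                        done[i] = True
                    else:
                        compute_left[i] = jobs[i].compute
        # The core runs the lowest-index ready job; otherwise it stalls.
        if ready:
            i = ready[0]
            compute_left[i] -= 1
            busy += 1
            if compute_left[i] == 0:
                wait_left[i] = jobs[i].wait
        tick += 1
    return tick, (busy / tick if tick else 0.0)


# Hypothetical communication-bound job: 2 ticks of compute, 8 ticks of waiting.
job = Job(compute=2, wait=8, iterations=5)
alone = simulate([job])
shared = simulate([job, job])
```

With these made-up parameters, one job alone takes 50 ticks at 20% CPU utilization, while two identical jobs sharing the core finish in 52 ticks instead of the 100 ticks that running them back to back would take. Making the jobs compute-bound (for example `compute=8, wait=2`) removes most of that benefit in this model, which matches the condition stated in the hypothesis.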