Improving Network Throughput with Global Communication Reordering

2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2018)

Abstract
We present a methodology to improve network throughput by reordering communication in HPC codes. In contrast to all previous work, our approach does not require any information about the network or the application communication topology. We implement on-the-fly algorithms that reorder message streams based on statistics inferred during execution. Our intuition is that long operations occurring in ranks that exhibit large execution variability need to be prioritized. We consider two approaches: 1) reordering using statistics of a group of significant ranks; and 2) reordering around an outlier rank. For robustness on noisy systems, our final algorithm combines group and outlier reordering and allows continuous adaptation of the schedule. We validate on two different networks: Cray Aries and InfiniBand. Micro-benchmarks show that the performance difference between two different communication schedules can be as high as 74%. Given an initial ordering, our algorithm can recover as much as 90% of the potential performance improvement. When employed in applications, we see improvements as large as 70% in communication times. If interference is present, the algorithm additionally reduces outliers and variance in the communication times.
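The intuition of prioritizing long, highly variable ranks, combined with reordering around an outlier, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the scoring rule (mean duration times standard deviation), the outlier test (mean above `factor` times the median of rank means), and all function names are assumptions chosen only for illustration.

```python
from statistics import mean, median, pstdev


def priority_order(samples):
    """Order rank ids so that long, highly variable ranks come first.

    samples: dict mapping rank id -> list of observed durations (seconds).
    The score used here is a hypothetical choice, not taken from the paper.
    """
    def score(rank):
        durations = samples[rank]
        return mean(durations) * pstdev(durations)

    # Sort by descending score; break ties by rank id for determinism.
    return sorted(samples, key=lambda r: (-score(r), r))


def find_outlier(samples, factor=2.0):
    """Return the slowest rank if its mean exceeds factor * median mean, else None."""
    means = {r: mean(d) for r, d in samples.items()}
    typical = median(list(means.values()))
    worst = max(means, key=means.get)
    return worst if means[worst] > factor * typical else None


def schedule(samples, factor=2.0):
    """Combine group ordering with outlier reordering (illustrative only)."""
    order = priority_order(samples)
    outlier = find_outlier(samples, factor)
    if outlier is not None:
        # Move the outlier rank to the front of the schedule.
        order.remove(outlier)
        order.insert(0, outlier)
    return order
```

Calling `schedule` again as new duration samples arrive would mimic the continuous adaptation mentioned in the abstract, under the same assumptions.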
Keywords
message reordering,RDMA,MPI,InfiniBand,global order