FlowStar: Fast Convergence Per-Flow State Accurate Congestion Control for InfiniBand

Changyun Luo, Huaxi Gu, Lijing Zhu, Huixia Zhang

IEEE/ACM Transactions on Networking (2024)

Abstract
According to the latest TOP500 list, InfiniBand (IB) is the most widely used network architecture in the top 10 supercomputers. IB relies on Credit-based Flow Control (CBFC) to provide a lossless network and on InfiniBand congestion control (IB CC) to relieve congestion. However, this combination can cause the victim-flow problem, because messages from different flows are mixed in the same queue, and long-lived congestion spreading, because convergence is slow. To address these problems, we propose FlowStar, a fast-convergence, per-flow-state, accurate congestion control scheme for InfiniBand. FlowStar includes two core mechanisms: 1) an optimized per-flow CBFC mechanism that provides flow state control to detect real congestion; and 2) rate adjustment rules that compensate for the mismatch between the original IB CC rate regulation and per-hop CBFC, alleviating congestion spreading. FlowStar maintains per-flow congestion state on switches and can obtain in-flight packet information without additional parameter settings to ensure a lossless network. Evaluations show that FlowStar improves average and tail message completion time under different workloads.
Keywords
InfiniBand, flow control, congestion control
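
The abstract attributes the victim-flow problem to flows sharing one queue under credit-based flow control. The sketch below is not FlowStar's algorithm, which the abstract does not specify. It is a minimal, generic illustration of credit accounting kept per flow on a single hop, so that a flow that has exhausted its credits does not block another. The class name `PerFlowCreditLink`, the parameter `credits_per_flow`, and the way in-flight packets are derived from consumed credits are all assumptions made for illustration.

```python
from collections import deque


class PerFlowCreditLink:
    """Hypothetical one-hop model: each flow has its own credit pool and buffer."""

    def __init__(self, credits_per_flow):
        self.credits_per_flow = credits_per_flow  # assumed fixed per-flow budget
        self.credits = {}   # flow_id -> credits still available to the sender
        self.buffers = {}   # flow_id -> packets held at the downstream port

    def can_send(self, flow_id):
        # A flow seen for the first time starts with a full credit budget.
        return self.credits.setdefault(flow_id, self.credits_per_flow) > 0

    def send(self, flow_id, packet):
        # Without a credit the sender holds the packet, so nothing is dropped.
        if not self.can_send(flow_id):
            return False
        self.credits[flow_id] -= 1
        self.buffers.setdefault(flow_id, deque()).append(packet)
        return True

    def drain(self, flow_id):
        # Downstream forwards one packet and returns one credit to the sender.
        buf = self.buffers.get(flow_id)
        if not buf:
            return None
        self.credits[flow_id] += 1
        return buf.popleft()

    def in_flight(self, flow_id):
        # Consumed credits equal packets sent but not yet drained downstream.
        return self.credits_per_flow - self.credits.get(flow_id, self.credits_per_flow)


link = PerFlowCreditLink(credits_per_flow=2)
hot_results = [link.send("hot", i) for i in range(4)]  # third and fourth are refused
victim_ok = link.send("victim", 0)                     # unaffected by the "hot" flow
```

In this toy model the "hot" flow stalls once its two credits are consumed, while the "victim" flow still sends, and the number of in-flight packets per flow falls out of the credit counters without any extra tuning parameter.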