DCQCN+: Taming Large-Scale Incast Congestion in RDMA over Ethernet Networks

2018 IEEE 26th International Conference on Network Protocols (ICNP), 2018

Abstract
Remote Direct Memory Access (RDMA) is gaining popularity in datacenter networks. The state-of-the-art congestion control scheme is DCQCN. However, DCQCN performs poorly under large-scale incast communication. When probing for available bandwidth, DCQCN increases the sending rate with a fixed period and fixed steps, and this scheme does not scale. Our key insight is that senders should be aware of the scale of each incast so that they can adjust their aggressiveness accordingly. The challenges are twofold: the scale of congestion is not easy to estimate, and the control scheme must be designed cautiously. In this paper, we propose DCQCN+ to improve performance under large-scale incast congestion in RDMA networks. DCQCN+ adapts its rate control mechanisms to different scenarios. DCQCN+ can handle incast congestion of at least 2,000 flows in both simulation and testbed experiments, a scale 10 times larger than that of DCQCN in simulation and 4 times larger in the testbed. DCQCN+ also achieves 10 times lower latency.
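To make the scalability argument concrete, the following is a minimal sketch of why a fixed per-flow increase step becomes more aggressive as an incast grows. It is not the DCQCN+ algorithm, which the abstract does not specify. The function names, the 40 Mbps base step, and the simple divide-by-flow-count rule are all assumptions chosen for illustration only.

```python
# Toy model (illustrative assumption, not the scheme from the paper):
# every flow in an incast adds the same step to its rate once per period.

def aggregate_step_mbps(num_flows: int, step_mbps: float) -> float:
    """Total rate added at the bottleneck per period when each flow adds step_mbps."""
    return num_flows * step_mbps


def scale_aware_step_mbps(num_flows: int, base_step_mbps: float) -> float:
    """Hypothetical scale-aware step: divide the base step by the incast scale."""
    return base_step_mbps / max(num_flows, 1)


BASE_STEP_MBPS = 40.0  # illustrative value, not taken from the paper

for n in (10, 100, 2000):
    fixed = aggregate_step_mbps(n, BASE_STEP_MBPS)
    adaptive = aggregate_step_mbps(n, scale_aware_step_mbps(n, BASE_STEP_MBPS))
    print(f"flows={n:5d}  fixed step: +{fixed:9.1f} Mbps/period  "
          f"scale-aware step: +{adaptive:6.1f} Mbps/period")
```

With a fixed step, the aggregate increase per period grows linearly with the number of flows. Under the hypothetical scale-aware rule it stays at the base step regardless of incast size, which is the intuition behind letting senders adjust their aggressiveness to the scale of the incast.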
Keywords
RDMA, Congestion Control, DCQCN, Incast