Specializing the network for scatter-gather workloads

SoCC '20: ACM Symposium on Cloud Computing, Virtual Event, USA, October 2020

Abstract
Data processing and distributed querying workloads often involve a "scatter-gather" or "partition-aggregate" architectural pattern, whereby one application queries hundreds or even thousands of workers. Network communication is often a bottleneck in this pattern, especially when the compute task at each worker is small, such as for Web queries and interactive analytics. The network bottleneck can result in low throughput and high CPU utilization, and can increase job completion time by orders of magnitude. To overcome these inefficiencies, we explore hardware offload of the scatter-gather primitive, whereby a smart NIC takes on the responsibility of sending out queries and collecting responses. We show that this approach not only virtually eliminates CPU usage, but with suitable scheduling of responses, it also speeds up scatter by allowing parallel queries, and gather by preventing throughput collapse due to excessive congestion. Besides response scheduling, we use a careful design at the NIC to limit FPGA resource usage: our approach uses about 25% of on-chip logic and 33% of on-chip memory on a mid-sized FPGA, leaving enough room for implementing other functions on the smart NIC.
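
The scatter-gather pattern described above can be illustrated with a minimal host-side sketch. This is not the paper's smart-NIC design: it is a software analogy, assuming a hypothetical `worker` coroutine that stands in for a remote worker, and a hypothetical `max_in_flight` parameter that caps how many query/response exchanges run at once as a rough stand-in for response scheduling.

```python
import asyncio
from typing import List


async def worker(worker_id: int, query: int) -> int:
    # Hypothetical worker: simulates a small compute task per query.
    await asyncio.sleep(0)
    return query * worker_id


async def scatter_gather(query: int, num_workers: int, max_in_flight: int) -> List[int]:
    # The semaphore bounds the number of concurrent exchanges, so responses
    # cannot all arrive at the requester at the same time.
    gate = asyncio.Semaphore(max_in_flight)

    async def ask(worker_id: int) -> int:
        async with gate:
            return await worker(worker_id, query)

    # Scatter: issue one query per worker. Gather: collect the responses,
    # returned in worker order.
    return await asyncio.gather(*(ask(i) for i in range(num_workers)))


def run_example() -> int:
    # Aggregate step: sum the per-worker responses.
    results = asyncio.run(scatter_gather(query=2, num_workers=100, max_in_flight=8))
    return sum(results)
```

Calling `run_example()` queries 100 simulated workers with at most 8 exchanges in flight and sums their responses. In the offloaded design the abstract describes, the sending, collecting, and scheduling would be done by the NIC rather than by host code like this.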