Battle of the BlueFields: An In-Depth Comparison of the BlueField-2 and BlueField-3 SmartNICs

2023 IEEE Symposium on High-Performance Interconnects (HOTI) (2023)

Abstract
Over the past several years, Smart Network Interface Cards (SmartNICs) have rapidly grown in popularity. In particular, NVIDIA’s BlueField line of SmartNICs has proven effective in a wide variety of uses: offloading communication in High-Performance Computing (HPC) applications, accelerating various stages of the Deep Learning (DL) pipeline, and serving the datacenter and virtualization workloads for which it is especially designed. The BlueField-3 DPU was released at the end of 2022 as a follow-up to its widely adopted predecessor, the BlueField-2. This work presents an in-depth performance evaluation of the two, comparing a) both SmartNICs’ on-chip capabilities (memory bandwidth, compute speed, etc.), and b) their offload capabilities through several benchmarks, micro-benchmarks, and applications. In single-DPU programs, the BlueField-3 shows up to 61% improvement in the latency of a memcpy operation and up to 82% improvement in bandwidth on the STREAM benchmark [8]. With a DPU-aware MPI library [1], we observe over 30% improvement at the micro-benchmark level when comparing staging-based designs on both SmartNICs, and up to nearly double that in the context of an application with staging-based designs. However, the GVMI (Guest Virtual Machine ID) based designs in the same library improve by no more than 10% at the benchmark level and by less than 2% in applications because of their architecture-insensitive nature. That is, while CPU clock speed may affect the completion time of instructions, the performance of the GVMI-based designs in a DPU-aware MPI library is largely unaffected by swapping the BlueField-2 for a BlueField-3.
Keywords
Datacenter Processing Units,BlueField-2,BlueField-3,SmartNIC,High-Performance Computing,Offload,Interconnects