FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters

Knowledge Discovery and Data Mining (2021)

Abstract
We present FleetRec, a high-performance and scalable recommendation inference system that operates within tight latency constraints. FleetRec takes advantage of heterogeneous hardware, including GPUs and the latest FPGAs equipped with high-bandwidth memory. By disaggregating computation and memory onto different types of hardware and connecting them through a high-speed network, FleetRec gains the best of both worlds and can naturally scale out by adding nodes to the cluster. Experiments on three production models of up to 114 GB show that FleetRec outperforms an optimized CPU baseline by more than one order of magnitude in throughput while achieving significantly lower latency.
Keywords
Recommendation System, Hardware Acceleration, FPGA, GPU