Disaggregated GPU Acceleration for Serverless Applications.

ACM SIGOPS Oper. Syst. Rev.(2023)

Abstract
Serverless platforms have been attracting applications from traditional platforms because infrastructure management responsibilities are shifted from users to providers. Many applications well-suited to serverless environments could leverage GPU acceleration to enhance their performance. Unfortunately, current serverless platforms do not expose GPUs to serverless applications. We present DGSF, a platform that enables serverless applications to access virtualized GPUs by disaggregating resources. DGSF facilitates provisioning and addresses utilization challenges by allowing a small pool of remote physical GPUs to serve potentially many serverless applications concurrently. With DGSF, the cloud provider decouples GPU resources from other resources, facilitating resource consolidation. In this article, we describe how DGSF tackles GPU disaggregation challenges using API remoting virtualization and optimizations that include hiding communication latency and pooling resources. Our evaluation shows that these API remoting optimizations can lower the runtime of an application by up to 50% relative to an unoptimized API remoting scheme. Because these optimizations aggressively remove the latency of GPU runtime and object management from the application's critical path, they can enable applications executing on DGSF to have a lower end-to-end time than when running on a GPU natively. Through consolidation, DGSF can lower queueing delays of applications that use GPUs by up to 53%. We also demonstrate DGSF's flexibility by augmenting applications on AWS Lambda with GPU support.
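To make the idea of hiding communication latency in API remoting concrete, the following is a minimal sketch and not DGSF's implementation. It assumes a simplified model in which calls whose results are not needed immediately are queued on the client and sent together with the next call that does need a result. All names (`FakeGpuServer`, `RemotingClient`, `run_workload`, and the `malloc`/`scale` operations) are hypothetical and exist only for illustration; the "server" is an in-process object that counts simulated network round trips.

```python
from typing import Any, Dict, List, Tuple


class FakeGpuServer:
    """Hypothetical stand-in for a remote GPU server; counts round trips."""

    def __init__(self) -> None:
        self.round_trips = 0
        self.memory: Dict[int, List[float]] = {}
        self._next_handle = 1

    def execute_batch(self, calls: List[Tuple[str, tuple]]) -> Any:
        """Run a batch of calls in order and return the last call's result."""
        self.round_trips += 1  # one simulated network round trip per batch
        result = None
        for name, args in calls:
            result = getattr(self, "_" + name)(*args)
        return result

    def _malloc(self, n: int) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self.memory[handle] = [0.0] * n
        return handle

    def _memcpy_to_device(self, handle: int, data: List[float]) -> None:
        self.memory[handle] = list(data)

    def _scale(self, handle: int, factor: float) -> None:
        self.memory[handle] = [x * factor for x in self.memory[handle]]

    def _memcpy_to_host(self, handle: int) -> List[float]:
        return list(self.memory[handle])


class RemotingClient:
    """Client-side stub that forwards API calls to the (fake) remote GPU."""

    def __init__(self, server: FakeGpuServer, batching: bool) -> None:
        self.server = server
        self.batching = batching
        self.pending: List[Tuple[str, tuple]] = []

    def call_async(self, name: str, *args: Any) -> None:
        """Issue a call whose result the application does not need yet."""
        if self.batching:
            self.pending.append((name, args))  # defer; no round trip now
        else:
            self.server.execute_batch([(name, args)])

    def call_sync(self, name: str, *args: Any) -> Any:
        """Issue a call whose result is needed; flush deferred calls first."""
        batch = self.pending + [(name, args)]
        self.pending = []
        return self.server.execute_batch(batch)


def run_workload(batching: bool) -> Tuple[List[float], int]:
    """Run a tiny workload and return (result, number of round trips)."""
    server = FakeGpuServer()
    client = RemotingClient(server, batching)
    buf = client.call_sync("malloc", 3)
    client.call_async("memcpy_to_device", buf, [1.0, 2.0, 3.0])
    client.call_async("scale", buf, 2.0)
    client.call_async("scale", buf, 5.0)
    out = client.call_sync("memcpy_to_host", buf)
    return out, server.round_trips


if __name__ == "__main__":
    print("unbatched:", run_workload(batching=False))
    print("batched:  ", run_workload(batching=True))
```

In this toy model both modes compute the same result, but the batched client needs fewer round trips because the deferred calls travel with the next synchronous call. A real system would also have to handle errors raised by deferred calls and ordering across multiple streams, which this sketch ignores.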
Keywords
GPU acceleration