SMSS: Stateful Model Serving in Metaverse With Serverless Computing and GPU Sharing

Zinuo Cai, Zebin Chen, Ruhui Ma, Haibing Guan

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (2024)

Abstract
With the rapid development of information technology, the concept of the Metaverse has swept the world and set off a new wave of industrial revolution. Constructing living and manufacturing scenes in the Metaverse requires the joint participation of scientists and engineers from various fields, with the human at the core. In the Metaverse, predicting human behavior and responses with deep learning models is valuable because the predictions allow services to be better tailored to participants. How to deploy multi-stage machine learning inference models has therefore become a bottleneck for the development of the Metaverse. Thanks to its scalability and pay-as-you-go billing model, emerging serverless computing can cope effectively with machine learning inference workloads. However, the statelessness of serverless computing and its lack of good support for GPU resource sharing make it difficult to deploy machine learning models directly on a serverless platform and exploit these advantages. We therefore propose SMSS, a stateful model inference service deployed on a serverless computing platform that supports GPU sharing. Because the serverless computing platform does not support stateful workflow execution, SMSS adopts log-based workflow runtime support. We also design a two-layer GPU sharing mechanism to fully exploit the potential of inter-model and intra-model GPU sharing. We evaluate the effectiveness of SMSS with real workloads. Our experimental results show that log-based stateful workflow runtime support ensures stateful execution of tasks with low overhead while also facilitating error localization and recovery. Two-layer GPU sharing reduces the cold start time of inference tasks by up to two orders of magnitude.
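The abstract does not describe how the log-based workflow runtime works internally. As a purely illustrative sketch of the general idea (not the SMSS implementation), the Python below records each completed stage's output in an append-only log, so that a retried multi-stage inference request replays finished stages from the log instead of re-executing them. The names `WorkflowLog` and `run_workflow`, the in-memory log, and the two toy stages are all hypothetical.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# A stage is a (name, function) pair; both are hypothetical placeholders.
Stage = Tuple[str, Callable[[Any], Any]]


class WorkflowLog:
    """Append-only log of completed stage outputs (kept in memory for this sketch)."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def append(self, workflow_id: str, stage: str, output: Any) -> None:
        # Each record is one JSON line, as a durable log might store it.
        self.records.append(
            json.dumps({"wf": workflow_id, "stage": stage, "output": output})
        )

    def completed(self, workflow_id: str) -> Dict[str, Any]:
        # Rebuild the map of finished stages for one workflow by scanning the log.
        done: Dict[str, Any] = {}
        for line in self.records:
            rec = json.loads(line)
            if rec["wf"] == workflow_id:
                done[rec["stage"]] = rec["output"]
        return done


def run_workflow(workflow_id: str, stages: List[Stage], request: Any,
                 log: WorkflowLog) -> Any:
    """Run stages in order, skipping any stage whose output is already logged."""
    done = log.completed(workflow_id)
    value = request
    for name, fn in stages:
        if name in done:
            value = done[name]  # replay the logged output; do not re-execute
            continue
        value = fn(value)       # may raise; nothing is logged for a failed stage
        log.append(workflow_id, name, value)
    return value


# Toy two-stage "inference" pipeline whose second stage crashes once.
calls = {"preprocess": 0, "infer": 0}


def preprocess(x: List[float]) -> List[float]:
    calls["preprocess"] += 1
    return [v / 10 for v in x]


def infer(x: List[float]) -> float:
    calls["infer"] += 1
    if calls["infer"] == 1:
        raise RuntimeError("simulated function crash")
    return sum(x)


log = WorkflowLog()
stages = [("preprocess", preprocess), ("infer", infer)]
try:
    run_workflow("req-1", stages, [10, 20, 30], log)
except RuntimeError:
    pass  # the stateless function instance is lost; only the log survives
result = run_workflow("req-1", stages, [10, 20, 30], log)
```

In this sketch the retry runs `preprocess` zero additional times, because its output is read back from the log, and the missing log record for `infer` shows which stage failed. A real system would need durable storage and would have to handle stage outputs that are too large or not serializable, which this example ignores.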
Keywords
Metaverse, serverless computing, model serving, stateful workflow, GPU sharing