Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

Yao Lu, Song Bian, Lequn Chen, Yongjun He, Yulong Hui, Matthew Lentz, Beibin Li, Fei Liu, Jialin Li, Qi Liu, Rui Liu, Xiaoxuan Liu, Lin Ma, Kexin Rong, Jianguo Wang, Yingjun Wu, Yongji Wu, Huanchen Zhang, Minjia Zhang, Qizhen Zhang, Tianyi Zhou, Danyang Zhuo

CoRR (2024)

Abstract
In this paper, we investigate the intersection of large generative AI models and cloud-native computing architectures. Recent large models such as ChatGPT, while revolutionary in their capabilities, face challenges like escalating costs and demand for high-end GPUs. Drawing analogies between large-model-as-a-service (LMaaS) and cloud database-as-a-service (DBaaS), we describe an AI-native computing paradigm that harnesses the power of both cloud-native technologies (e.g., multi-tenancy and serverless computing) and advanced machine learning runtime (e.g., batched LoRA inference). These joint efforts aim to optimize costs-of-goods-sold (COGS) and improve resource accessibility. The journey of merging these two domains is just at the beginning and we hope to stimulate future research and development in this area.
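The abstract highlights batched LoRA inference as one of the machine learning runtime techniques behind the proposed AI-native paradigm. As a hedged illustration of the general idea (not the authors' implementation), the sketch below shows a single linear layer serving a mixed batch: the base weight matmul is shared across all requests, while each request's low-rank adapter update is applied per tenant. All names, dimensions, and the two-tenant setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 8, 8, 2
W = rng.standard_normal((d_in, d_out))  # shared base weight, common to all tenants

# Hypothetical per-tenant LoRA adapters: W_eff(t) = W + A_t @ B_t, rank << d
adapters = {
    t: (rng.standard_normal((d_in, rank)), rng.standard_normal((rank, d_out)))
    for t in range(2)
}

def batched_lora_forward(x, adapter_ids):
    """One linear layer over a mixed batch: the base GEMM is computed once
    for the whole batch; low-rank updates are gathered by adapter id."""
    out = x @ W                              # single shared matmul
    for t, (A, B) in adapters.items():
        mask = adapter_ids == t              # rows in the batch using adapter t
        if mask.any():
            out[mask] += (x[mask] @ A) @ B   # rank-r update per tenant, cheap
    return out

x = rng.standard_normal((4, d_in))
ids = np.array([0, 1, 0, 1])   # two tenants interleaved in one batch
y = batched_lora_forward(x, ids)
```

The multi-tenancy benefit mirrored here is that many adapters can share one copy of the base model in GPU memory, with only the small `A`/`B` matrices kept per tenant.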