A Study of Orchestration Approaches for Scientific Workflows in Serverless Computing

SESAME '23: Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies (2023)

Abstract
Scientific workflows are typically data- and compute-intensive. They consist of many stages, each of which may contain hundreds or even thousands of tasks. Traditionally, scientific workflows have been executed using the serverful computing model. Serverless computing presents an attractive alternative because it frees developers from provisioning and managing resources and offers a fine-grained pay-as-you-go pricing model. In this paper, we investigate the viability of using serverless computing to execute scientific workflows. Specifically, we discuss, implement, and evaluate three orchestration approaches for executing scientific workflows: serverful-centralized, serverless-centralized, and serverless-decentralized. This work is the first to implement and evaluate a purely serverless orchestration approach that does not require deploying a dedicated workflow manager. Our evaluation shows that serverless orchestration approaches incur a noticeable performance overhead for some workflow patterns (e.g., reduce stages) because they access large amounts of remote data. We propose two optimizations (i.e., prefetching file privileges and container placement) that exploit data locality to mitigate that overhead. Our evaluation with the Montage application shows that a fully decentralized approach achieves performance comparable to that of a serverful approach. Our results also show that the prefetching file privileges and container placement optimizations improve performance by 26% and 44%, respectively, compared to an unoptimized version.