CWL-PLAS: Task Workflows Assisted by Data Science Cloud Platforms

IEEE Access (2023)

Abstract
The Common Workflow Language (CWL) is a platform-independent language for describing data science workflows, each consisting of a set of tasks that interact with one another to perform a scientific analysis. Tasks can be packaged as Linux containers. Containers ensure that workflows are reproducible and portable, but they also limit each task to the resources of the host on which its container runs. In this paper, we propose CWL-PLAS, an extension of CWL that allows a task to instantiate and temporarily use a supporting cloud platform for parallel computing that is specialized for the task's activity. In this way, a task can leverage the resources of multiple hosts in parallel, reducing the duration of the workflow. We implemented an open-source workflow manager that supports CWL-PLAS workflows and uses a Kubernetes back-end, and we used it to evaluate the performance of CWL-PLAS on a couple of machine learning workflows.
Keywords
Computer languages, Software management and development, Parallel processing, Processor scheduling, Standards, Portfolios, Distributed computing, Common workflow language, workflow management software, distributed computing, cloud