Reproducibility of execution environments in computational science using Semantics and Clouds

Future Generation Computer Systems(2017)

Abstract
In the past decades, one of the most common forms of addressing reproducibility in scientific workflow-based computational science has consisted of tracking the provenance of the produced and published results. Such provenance allows inspecting intermediate and final results, improves understanding, and permits replaying a workflow execution. Nevertheless, this approach does not provide any means for capturing and sharing the very valuable knowledge about the experimental equipment of a computational experiment, i.e., the execution environment in which the experiments are conducted. In this work, we propose a novel approach based on semantic vocabularies that describes the execution environment of scientific workflows, so as to conserve it. We define a process for documenting the workflow application and its related management system, as well as their dependencies. Then we apply this approach over three different real workflow applications running in three distinct scenarios, using public, private, and local Cloud platforms. In particular, we study one astronomy workflow and two life science workflows for genomic information analysis. Experimental results show that our approach can reproduce an equivalent execution environment of a predefined virtual machine image on all evaluated computing platforms.
Keywords
Semantic metadata, Scientific workflow, Reproducibility, Life sciences