Capturing provenance information for biomedical data and workflows: A scoping review

Research Square (Research Square)(2023)

引用 0|浏览2
暂无评分
摘要
Abstract Background: Provenance enriched scientific results ensure their reproducibility and trustworthiness, particularly when containing sensitive data. Provenance information leads to higher interpretability of scientific results and enables reliable collaboration and data sharing. However, the lack of comprehensive evidence on provenance approaches hinders the uptake of good scientific practice in clinical research. Our scoping review identifies evidence regarding approaches and criteria for provenance tracking in the biomedical domain. We investigate the state-of-the-art frameworks, associated artifacts, and methodologies for provenance tracking. Methods: This scoping review followed the methodological framework by Arksey and O'Malley. PubMed and Web of Science databases were searched for English-language articles published from January 1, 2006, to March 23, 2021. Title and abstract screening were carried out by four independent reviewers using the Rayyan screening tool. A majority vote was required for consent on the eligibility of papers based on the defined inclusion and exclusion criteria. Full-text reading and screening were performed independently by two reviewers, and information was extracted into a pre-tested template for the five research questions. Disagreements were resolved by a domain expert. The study protocol has previously been published. Results: The search resulted in a total of 564 papers. Of 469 identified, de-duplicated papers, 54 studies fulfilled the inclusion criteria and were subjected to five research questions. The review identified the heterogeneous tracking approaches, their artifacts, and varying degrees of fulfillment of the research questions. Based on this, we developed a roadmap for a tailor-made provenance framework considering the software life cycle. Conclusions: In this paper we investigate the state-of-the-art frameworks, associated artifacts, and methodologies for provenance tracking including real-life applications. We observe that most authors imply ideal conditions for provenance tracking. However, our analysis discloses several gaps for which we illustrate future steps toward a systematic provenance strategy. We believe the recommendations enforce quality and guide the implementation of auditable and measurable provenance approaches as well as solutions in the daily routine of biomedical scientists.
更多
查看译文
关键词
provenance information,biomedical data,workflows,scoping review
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要