Reproducible acquisition, management, and meta-analysis of nucleotide sequence (meta)data using q2-fondue

bioRxiv (Cold Spring Harbor Laboratory) (2022)

Abstract
The volume of public nucleotide sequence data has blossomed over the past two decades, enabling novel discoveries via re-analysis, meta-analyses, and comparative studies for uncovering general biological trends. However, reproducible re-use and management of sequence datasets remain a challenge. We created the software plugin q2-fondue to enable user-friendly acquisition, re-use, and management of public nucleotide sequence (meta)data while adhering to open data principles. The software allows fully provenance-tracked programmatic access to and management of data from the Sequence Read Archive (SRA). Sequence data and accompanying metadata retrieved with q2-fondue follow a validated format, which is interoperable with the QIIME 2 ecosystem and its multiple user interfaces. To highlight the manifold capabilities of q2-fondue, we present several demonstration analyses using amplicon, whole genome, and shotgun metagenome datasets. These use cases demonstrate how q2-fondue increases analysis reproducibility and transparency from data download to final visualizations by including source details in the integrated provenance graph. We believe q2-fondue will lower existing barriers to comparative analyses of nucleotide sequence data, enabling more transparent, open, and reproducible conduct of meta-analyses. q2-fondue is a Python 3 package released under the BSD 3-clause license at https://github.com/bokulich-lab/q2-fondue.
Keywords
nucleotide sequence, reproducible acquisition, meta-analysis