Data proliferation, reconciliation, and synthesis in viral ecology

bioRxiv (Cold Spring Harbor Laboratory) (2021)

Abstract
The fields of viral ecology and evolution have rapidly expanded in the last two decades, driven by technological improvements and motivated by efforts to discover potentially zoonotic wildlife viruses under the rubric of pandemic prevention. One consequence has been a massive proliferation of host-virus association data, which comprise the backbone of research in viral macroecology and zoonotic risk prediction. These data remain fragmented across numerous data portals and projects, each with its own scope, structure, and reporting standards. Here, we propose that synthesis of host-virus association data is a central challenge for improving our understanding of the global virome and developing foundational theory in viral ecology. To illustrate this, we build an open reconciled mammal-virus database from four key published datasets, applying a standardized taxonomy and metadata. We show that reconciling these datasets provides a substantially richer view of the mammal virome than that offered by any individual database. We argue for a shift in best practice towards the incremental development and use of synthetic datasets in viral ecology research, both to improve comparability and replicability across studies, and to facilitate future efforts to use machine learning to predict the structure and dynamics of the global virome.
Keywords
viral ecology, reconciliation, data