Exploring bacterial diversity via a curated and searchable snapshot of archived DNA

Access Microbiology (2022)

Abstract
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function, and even of anthropogenic perturbations such as the widespread use of antimicrobials. Whilst these archives are rich in data, considerable processing is required before a biological question can be addressed. Here, we have assembled, quality-controlled and characterised 661,405 bacterial genomes that were in the European Nucleotide Archive (ENA) at the end of November 2018, using a uniform, standardised approach. A searchable index has been produced, allowing the entire dataset to be interrogated easily for a specific gene or mutation. Our analysis shows how uneven the species composition is within this database, with just 20 of the total 2,336 species making up 90% of the high-quality genomes. The over-represented species tend to be acute/common human pathogens, often aligning with research priorities at different levels: from individual researchers with targeted research questions, through the areas of focus of funding bodies or national public health agencies, to the pathogens identified globally by the WHO as priorities for their resistance to front- and last-line antimicrobials. Whilst this is a rich resource that often provides the context or references for multi-‘omic’ studies and supports discovery research in many domains, understanding the actual and potential biases in bacterial diversity depicted in this snapshot, and hence within the data being submitted to the public sequencing archives, is essential if we are to target and fill gaps in our understanding of the bacterial kingdom.