MIMt: A curated 16S rRNA reference database with less redundancy and higher accuracy at species-level identification

Maria Pilar Cabezas Rodriguez, Nuno A. Fonseca, Antonio Munoz-Merida

bioRxiv (2023)

Abstract
Motivation: Accurate determination and quantification of the taxonomic composition of microbial communities, especially at the species level, is one of the major challenges in metagenomics. This is primarily due to the limitations of commonly used 16S rRNA reference databases, which either contain substantial redundancy or have a high percentage of sequences with missing taxonomic information. Using these incomplete or biased databases may lead to erroneous identifications and, in turn, to erroneous conclusions about the ecological role and importance of those microorganisms in the ecosystem.

Results: The current study presents MIMt, a new 16S rRNA database for the identification of archaea and bacteria, encompassing 39 940 sequences, all precisely identified at the species level. MIMt is intended to be updated at least once a year to include all newly sequenced species. We evaluated MIMt against Greengenes, RDP, GTDB and SILVA in terms of sequence distribution and accuracy of taxonomic assignments. Our results showed that MIMt contains less redundancy and, despite being 5 to 85 times smaller than existing databases, outperforms them in completeness and taxonomic accuracy. This enables more precise assignments at lower taxonomic ranks and thus significantly improves species-level identification.

Availability and Implementation: MIMt is freely available for non-commercial purposes at https://mimt.bu.at.biopolis.pt

Competing Interest Statement: The authors have declared no competing interest.
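The two database properties the abstract highlights, redundancy and the share of sequences annotated down to species, can be illustrated with a small sketch. This is not the authors' evaluation procedure. It assumes a hypothetical FASTA header layout (`>ID lineage`, with a semicolon-separated seven-rank lineage ending in species), treats redundancy simply as exact duplicate sequences, and uses made-up example records.

```python
# Illustrative sketch only: the header layout, the seven-rank lineage and the
# exact-duplicate definition of redundancy are assumptions, not taken from MIMt.

EXAMPLE = """\
>seq1 Bacteria;Pseudomonadota;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;Escherichia coli
ACGTACGT
>seq2 Bacteria;Pseudomonadota;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;
ACGTACGT
>seq3 Bacteria;Bacillota;Bacilli;Bacillales;Bacillaceae;Bacillus;Bacillus subtilis
TTGACCGA
>seq4 Bacteria;Bacillota;Bacilli
GGCATTAC
"""


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def redundancy(records):
    """Fraction of records whose sequence exactly duplicates another record."""
    seqs = [seq for _, seq in records]
    if not seqs:
        return 0.0
    return 1.0 - len(set(seqs)) / len(seqs)


def species_level_fraction(records, sep=";", species_rank=7):
    """Fraction of records whose lineage has a non-empty species rank."""
    if not records:
        return 0.0
    annotated = 0
    for header, _ in records:
        parts = header.split(None, 1)  # [ID, lineage] under the assumed layout
        lineage = parts[1] if len(parts) > 1 else ""
        ranks = [rank.strip() for rank in lineage.split(sep)]
        if len(ranks) >= species_rank and ranks[species_rank - 1]:
            annotated += 1
    return annotated / len(records)


if __name__ == "__main__":
    records = list(parse_fasta(EXAMPLE))
    print(f"records: {len(records)}")
    print(f"redundancy: {redundancy(records):.2f}")
    print(f"species-level fraction: {species_level_fraction(records):.2f}")
```

Real reference databases use different header conventions, and redundancy is often assessed by clustering at an identity threshold rather than by exact matching, so both functions would need adapting before being applied to an actual database file.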