Strategies to Avoid Wrongly Labelled Genomes Using as Example the Detected Wrong Taxonomic Affiliation for Aeromonas Genomes in the GenBank Database

PLOS ONE (2015)

Abstract
Around 27,000 prokaryote genomes are presently deposited in the Genome database of GenBank at the National Center for Biotechnology Information (NCBI), and this number is growing exponentially. However, it is not known how many of these genomes correspond correctly to their designated taxon. The taxonomic affiliation of 44 Aeromonas genomes deposited at the NCBI (only five of which are from type strains) was determined by multilocus phylogenetic analysis (MLPA) and by pairwise average nucleotide identity (ANI). Both the MLPA and the ANI results showed a discordant taxon assignment for 14 (35.9%) of the 39 non-type strain genomes. The data presented in this study also demonstrated that, if the genome of the type strain is not available, a correctly identified genome of the same species can be used as a reference for ANI calculations. Of the three ANI calculating tools compared (ANI calculator, EzGenome and JSpecies), EzGenome and JSpecies provided very similar results. However, the ANI calculator provided higher intra- and inter-species values than the other two tools (differences within the ranges 0.06-0.82% and 0.92-3.38%, respectively). Nevertheless, each of these tools produced the same species classification for the studied Aeromonas genomes. To avoid possible misinterpretations with the ANI calculator, particularly when values are at the borderline of the 95% cutoff, it should be used in combination with one of the other calculation tools (EzGenome or JSpecies). It is recommended that, once a genome sequence is obtained, its taxonomic affiliation is verified using ANI or an MLPA before it is submitted to the NCBI, and that researchers amend the existing taxonomic errors present in databases.
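The ANI check recommended in the abstract can be illustrated with a minimal Python sketch. It assumes the pairwise ANI values against reference genomes have already been computed with one of the tools named above; the function name `check_label`, the 1% borderline margin, the placeholder species names, and the example values are all hypothetical and are not taken from the paper. Only the 95% cutoff comes from the abstract.

```python
from typing import Dict, Optional, Tuple

ANI_CUTOFF = 95.0         # species cutoff (percent) mentioned in the abstract
BORDERLINE_MARGIN = 1.0   # hypothetical margin for flagging borderline values


def check_label(
    labelled_species: str,
    ani_to_references: Dict[str, float],
) -> Tuple[Optional[str], bool, bool]:
    """Compare a genome's database label with its best ANI match.

    ani_to_references maps a reference species name to the ANI (percent)
    between the query genome and that species' reference genome.
    Returns (best_match, concordant, borderline), where best_match is None
    if no reference reaches the cutoff.
    """
    # Pick the reference genome with the highest ANI to the query.
    best_species, best_ani = max(ani_to_references.items(), key=lambda kv: kv[1])
    # Accept the match only if it reaches the species cutoff.
    best_match = best_species if best_ani >= ANI_CUTOFF else None
    # The label is concordant when it equals the accepted best match.
    concordant = best_match == labelled_species
    # Values close to the cutoff should be re-checked with a second ANI tool.
    borderline = abs(best_ani - ANI_CUTOFF) < BORDERLINE_MARGIN
    return best_match, concordant, borderline


# Invented example values, for illustration only.
print(check_label("species_A", {"species_A": 93.1, "species_B": 97.2}))
# ('species_B', False, False)
```

A genome flagged as borderline would, following the abstract's advice, be re-evaluated with a second tool (EzGenome or JSpecies) before its label is accepted or amended.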
Keywords
biomedical research, bioinformatics