miniBUSCO: a faster and more accurate reimplementation of BUSCO

Neng Huang, Heng Li

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Evaluating the completeness of a genome assembly is a critical part of assessing the accuracy and reliability of genomic data. An incomplete assembly can lead to errors in gene prediction, annotation, and other downstream analyses. BUSCO is one of the most widely used tools for assessing assembly completeness; it checks for the presence of a set of single-copy orthologs conserved across a wide range of taxa. However, BUSCO's runtime can be long, particularly for large genome assemblies, which makes it difficult for researchers to iterate quickly on an assembly or to analyze a large number of assemblies. Here, we present miniBUSCO, an efficient tool for assessing the completeness of genome assemblies. miniBUSCO uses the protein-to-genome aligner miniprot together with the datasets of conserved orthologous genes from BUSCO. Our evaluation on a real human assembly indicates that miniBUSCO achieves a 14-fold speedup over BUSCO. Furthermore, miniBUSCO reports a completeness of 99.6%, which is more accurate than BUSCO's 95.7% and in close agreement with the annotation completeness of 99.5% for T2T-CHM13.
Availability: https://github.com/huangnengCSU/minibusco
Contact: hli@ds.dfci.harvard.edu
Supplementary data are available at Bioinformatics online.
Keywords
miniBUSCO, accurate reimplementation