Identification of errors in draft genome assemblies at single-nucleotide resolution for quality assessment and improvement.

Nature communications(2023)

引用 1|浏览8
暂无评分
摘要
Assembly of a high-quality genome is important for downstream comparative and functional genomic studies. However, most tools for genome assembly assessment only give qualitative reports, which do not pinpoint assembly errors at specific regions. Here, we develop a new reference-free tool, Clipping information for Revealing Assembly Quality (CRAQ), which maps raw reads back to assembled sequences to identify regional and structural assembly errors based on effective clipped alignment information. Error counts are transformed into corresponding assembly evaluation indexes to reflect the assembly quality at single-nucleotide resolution. Notably, CRAQ distinguishes assembly errors from heterozygous sites or structural differences between haplotypes. This tool can clearly indicate low-quality regions and potential structural error breakpoints; thus, it can identify misjoined regions that should be split for further scaffold building and improvement of the assembly. We have benchmarked CRAQ on multiple genomes assembled using different strategies, and demonstrated the misjoin correction for improving the constructed pseudomolecules.
更多
查看译文
关键词
genome assemblies,quality assessment,draft,single-nucleotide
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要