Effect of Sequence Depth and Length in Long-read Assembly of the Maize Inbred NC358

bioRxiv (2019)

Abstract
Recent improvements in the quality and yield of long-read data and scaffolding technology have made it possible to rapidly generate reference-quality assemblies for complex genomes. Still, generating these assemblies is costly, and an assessment of critical sequence depth and read length to obtain high-quality assemblies is important for allocating limited resources. To this end, we have generated eight independent assemblies for the complex genome of the maize inbred line NC358 using PacBio datasets ranging from 20-75x genomic depth and N50 read lengths of 11-21 kb. Assemblies with 30x or less depth and N50 read length of 11 kb were highly fragmented, with even the low-copy genic fraction of the genome showing degradation at 20x depth. Distinct sequence-quality thresholds were observed for complete assembly of genes, transposable elements, and highly repetitive genomic features such as telomeres, heterochromatic knobs and centromeres. This study provides a useful resource allocation reference to the community as long-read technologies continue to mature.