Hybrid assembly of ultra-long nanopore reads augmented with 10×-genomics contigs: Demonstrated with a human genome.

Genomics (2019)

Abstract
Third-generation sequencing (3GS) technologies generate ultra-long reads (up to 1 Mb), which make it possible to eliminate gaps and effectively resolve repeats in genome assembly. However, 3GS technologies suffer from high base-level error rates (15%–40%) and high sequencing costs. To address these issues, the hybrid assembly strategy, which uses both 3GS reads and inexpensive next-generation sequencing (NGS) short reads, was developed. Here, we use 10×-Genomics® technology, which integrates a novel barcoding strategy with Illumina® NGS and has the advantage of revealing long-range sequence information, to replace common NGS short reads in the hybrid assembly of long, error-prone 3GS reads. We demonstrate the feasibility of integrating 3GS with 10×-Genomics technologies in a new strategy for hybrid de novo genome assembly, using the DBG2OLC and Sparc software packages previously developed by the authors for regular hybrid assembly. Using a human genome as an example, we show that with only 7× coverage of ultra-long Nanopore® reads, augmented with 10×-Genomics reads, our approach achieves nearly the same quality as a non-hybrid assembly with 35× coverage of Nanopore reads. Compared with an assembly from 10×-Genomics reads alone, our assembly is gapless, at a slightly higher cost. These results suggest that our new hybrid assembly of ultra-long 3GS reads augmented with 10×-Genomics reads offers a low-cost (less than ¼ the cost of the non-hybrid assembly) and computationally lightweight (109 calendar hours with a peak memory usage of 61 GB on a dual-CPU office workstation) solution for extending the applications of 3GS technologies.
Keywords
3GS (3rd generation sequencing), 10× Genomics, DBG2OLC, Sparc, Hybrid assembly, Nanopore, Human genome