Advanced Whole Genome Sequencing Using a Complete PCR-free Massively Parallel Sequencing (MPS) Workflow

bioRxiv (2019)

Abstract
Systematic errors can be introduced by amplification during MPS library preparation and cluster/array formation. Polymerase chain reaction (PCR)-free library preparation methods have previously demonstrated improved sequencing quality with PCR-amplified read clusters; however, we hypothesized that some InDel errors are still introduced by the remaining PCR step. Here we sequenced PCR-free libraries on MGI's PCR-free DNBSEQ arrays to obtain, for the first time, true PCR-free whole genome sequencing (WGS). We used MGI's PCR-free WGS library preparation kits, as recommended or with some modifications, to make several NA12878 libraries. Reproducibly high-quality libraries were obtained with low bias and less than 1% read duplication for both ultrasonic and enzymatic DNA fragmentation. In a triplicate analysis, over 99% of SNPs and about 98% of InDels in each library were found in at least one of the other two libraries. Using machine learning (ML) methods (DeepVariant or DNAscope), variant calling performance (SNP F-measure > 99.94%, InDel F-measure > 99.6%) exceeded the widely accepted standards. The F-measure of 15X PCR-free ML-WGS was comparable to or even better than that of 30X PCR WGS analyzed with GATK. Furthermore, PCR-free WGS libraries sequenced on the PCR-free DNBSEQ platform have up to 55% fewer InDel errors than on the NovaSeq platform, confirming that DNA clusters have PCR-generated errors. Enabled by the new PCR-free library kits, a super high-throughput sequencer, and ML-based variant calling, DNBSEQ true PCR-free WGS provides a powerful solution that improves accuracy while reducing cost and analysis time, facilitating future precision medicine, cohort studies, and large population genome projects.
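The abstract reports two kinds of metrics: an F-measure per variant type, and the share of variants in one library that are also found in at least one of the other two. The following is a minimal sketch of how such numbers are commonly computed, not the authors' pipeline. It assumes the F-measure is the usual harmonic mean of precision and recall against a truth set. The function names and every count or variant key in it are hypothetical.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from variant-call counts.

    tp, fp, fn are true positives, false positives and false negatives
    relative to a truth set (hypothetical inputs, not values from the paper).
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def shared_fraction(library: set, other_a: set, other_b: set) -> float:
    """Fraction of variants in `library` seen in at least one other replicate."""
    if not library:
        return 0.0
    found_elsewhere = library & (other_a | other_b)
    return len(found_elsewhere) / len(library)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    print(f"F-measure: {f_measure(tp=9990, fp=5, fn=10):.4%}")

    # Variants keyed as (chromosome, position, ref, alt); all made up.
    rep1 = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "GA")}
    rep2 = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "GA")}
    rep3 = {("chr1", 100, "A", "G"), ("chr3", 10, "T", "C")}
    print(f"Shared with another replicate: {shared_fraction(rep1, rep2, rep3):.1%}")
```

Keying variants by position and alleles is a simplification. Real comparisons usually normalize InDel representation first, because the same InDel can be written in more than one equivalent way.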
Keywords
WGS, PCR-free, DNBSEQ™, Machine learning based variant calling