An efficient large-scale whole-genome sequencing analyses practice with an average daily analysis of 100Tbp: ZBOLT

Zhichao Li, Yinlong Xie, Wenjun Zeng, Yushan Huang, Shengchang Gu, Ya Gao, Weihua Huang, Lihua Lu, Xiaohong Wang, Jiasheng Wu, Xiaoxu Yin, Rongyi Zhu, Guodong Huang, Lin Lu, Jingbo Tang, Yunping Zheng, Quan Liu, Xianqiang Zhou, Riqiang Shan, Bo Wang, Mingyan Fang, Xin Jin

CLINICAL AND TRANSLATIONAL DISCOVERY (2023)

Abstract
Background: With the advancement of whole-genome sequencing (WGS) technology, massively parallel sequencing (MPS) remains the mainstream due to its accuracy, low cost, and high throughput. The development of the analytical pipeline corresponding to MPS has always been of great importance. Increasingly large population genomics studies, as a specific type of big data research, pose new challenges for analysis solutions.

Results: Here, we introduce ZBOLT, a comprehensive analysis system that incorporates both software and hardware advancements, making it an appropriate choice for large-scale population genomic studies that require extensive data processing. In this study, we first evaluate ZBOLT's calling accuracy using the Genome in a Bottle (GIAB) benchmark dataset. Then we apply ZBOLT to a large-scale population genomics study with 5,616 high sequencing depth samples totaling 1.16 Pbp (base pairs). As the results show, ZBOLT demonstrates exceptional efficiency and low energy consumption, processing 100 Tbp per day and using 1 kWh per 100 Gbp sequenced sample.

Conclusion: This research serves as a valuable reference for analyzing sequencing data from large population cohorts and underscores the significant potential of ZBOLT in large-scale population genomics studies.

ZBOLT is a comprehensive analysis system that incorporates both software and hardware advancements, making it an appropriate choice for large-scale population genomic studies. ZBOLT's calling accuracy is comparable to GATK Best Practices. ZBOLT is able to analyze 100 Tbp of raw sequencing data per day. ZBOLT consumes 1 kWh per 100 Gbp sequenced sample.
Keywords
efficient, large-scale, WGS analysis, ZBOLT
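
The headline figures in the abstract (5,616 samples, 1.16 Pbp in total, 100 Tbp per day, 1 kWh per 100 Gbp) can be related to one another with simple arithmetic. The sketch below is only a back-of-envelope check, not the authors' method. It assumes that the reported throughput is sustained for the whole cohort and that energy use scales linearly with data volume; neither assumption is stated in the abstract. The function name `cohort_estimates` and its parameters are hypothetical.

```python
# Back-of-envelope estimates using only figures quoted in the abstract.
# Assumptions (not from the paper): throughput is sustained over the whole
# cohort, and energy use scales linearly with the amount of data analyzed.

GBP_PER_TBP = 1_000  # 1 Tbp = 1,000 Gbp


def cohort_estimates(cohort_tbp, samples, tbp_per_day, kwh_per_100_gbp):
    """Return rough wall-clock, energy, and per-sample size estimates."""
    cohort_gbp = cohort_tbp * GBP_PER_TBP
    return {
        # Days needed at the reported daily throughput.
        "days": cohort_tbp / tbp_per_day,
        # Total energy at the reported per-100-Gbp consumption.
        "energy_kwh": cohort_gbp / 100 * kwh_per_100_gbp,
        # Average raw data volume per sample.
        "gbp_per_sample": cohort_gbp / samples,
    }


if __name__ == "__main__":
    # 1.16 Pbp = 1,160 Tbp; other values are taken from the abstract.
    est = cohort_estimates(
        cohort_tbp=1_160,
        samples=5_616,
        tbp_per_day=100,
        kwh_per_100_gbp=1.0,
    )
    for name, value in est.items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, the 1.16 Pbp cohort corresponds to roughly 11.6 days of analysis, about 11,600 kWh, and an average of about 207 Gbp per sample.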