SLOW5: a new file format enables massive acceleration of nanopore sequencing data analysis

user-5dd52aee530c701191bf1b99(2021)

引用 1|浏览10
暂无评分
摘要
ABSTRACT Nanopore sequencing is an emerging genomic technology with great potential. However, the storage and analysis of nanopore sequencing data have become major bottlenecks preventing more widespread adoption in research and clinical genomics. Here, we elucidate an inherent limitation in the file format used to store raw nanopore data – known as FAST5 – that prevents efficient analysis on high-performance computing (HPC) systems. To overcome this, we have developed SLOW5, an alternative file format that permits efficient parallelisation and, thereby, acceleration of nanopore data analysis. For example, we show that using SLOW5 format, instead of FAST5, reduces the time and cost of genome-wide DNA methylation profiling by an order of magnitude on common HPC systems, and delivers consistent improvements on a wide range of different architectures. With a simple, accessible file structure and a ~25% reduction in size compared to FAST5, SLOW5 format will deliver substantial benefits to all areas of the nanopore community.
更多
查看译文
关键词
Nanopore sequencing,File format,Nanopore,Embedded system,Computer science,Reduction (complexity),Acceleration,Clinical genomics,Dna methylation profiling,Genomic technology
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要