FPGA acceleration of reference-based compression for genomic data

2015 International Conference on Field Programmable Technology (FPT)(2015)

引用 17|浏览39
暂无评分
摘要
One of the key challenges facing genomics today is efficiently storing the massive amounts of data generated by next-generation sequencing platforms. Reference-based compression is a popular strategy for reducing the size of genomic data, whereby sequence information is encoded as a mapping to a known reference sequence. Determining the mapping is a computationally intensive problem, and is the bottleneck of most reference-based compression tools currently available. This paper presents the first FPGA acceleration of reference-based compression for genomic data. We develop a new mapping algorithm based on the FM-index search operation which includes optimisations targeting the compression ratio and speed. Our hardware design is implemented on a Maxeler MPC-X2000 node comprising 8 Altera Stratix V FPGAs. When evaluated against compression tools currently available, our tool achieves a superior compression ratio, compression time, and energy consumption for both FASTA and FASTQ formats. For example, our tool achieves a 30% higher compression ratio and is 71.9 times faster than the fastqz tool.
更多
查看译文
关键词
Altera Stratix V FPGA,Maxeler MPC-X2000 node,mapping algorithm,next generation sequencing platform,genomic data,reference based compression,FPGA acceleration
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要