An FPGA-Based BWT Accelerator for Bzip2 Data Compression

2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)(2019)

引用 17|浏览127
暂无评分
摘要
The Burrows-Wheeler Transform (BWT) has played an important role in lossless data compression algorithms. To achieve a good compression ratio, the BWT block size needs to be several hundreds of kilobytes, which requires a large amount of on-chip memory resources and limits effective hardware implementations. In this paper, we analyze the bottleneck of the BWT acceleration and present a novel design to map the anti-sequential suffix sorting algorithm to FPGAs. Our design can perform BWT with a block size of up to 500KB (i.e., bzip2 level 5 compression) on the Xilinx Virtex UltraScale+ VCU1525 board, while the state-of-art FPGA implementation can only support 4KB block size. Experiments show our FPGA design can achieve ~2x speedup compared to the best CPU implementation using standard large Corpus benchmarks.
更多
查看译文
关键词
Sorting,Field programmable gate arrays,Dictionaries,System-on-chip,Software algorithms,Software,Hardware
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要