High Throughput Large Scale Sorting on a CPU-FPGA Heterogeneous Platform

2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016

Abstract
Accelerating sorting with FPGAs has recently attracted growing interest in both industry and academia. However, FPGA-only sorting designs usually support only small data sets because on-chip memory is limited. In this paper, we propose a design to speed up large-scale sorting using a CPU-FPGA heterogeneous platform. We first optimize a fully pipelined merge-sort-based accelerator and deploy several instances of it working in parallel on the FPGA. The partial results from the FPGA are then merged on the CPU. On the Intel QuickAssist QPI FPGA Platform, for a range of data set sizes, we improve the throughput by 2.9× and 1.9× compared with CPU-only and FPGA-only baselines, respectively. Compared with the state-of-the-art FPGA implementation for sorting, our design achieves a 2.3× throughput improvement.
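The two-stage flow described in the abstract (independent sorters produce partial results, which are then merged on the CPU) can be sketched in software as below. This is only a minimal illustration under stated assumptions, not the paper's implementation: the function names `sort_chunks`, `merge_on_cpu`, and `hybrid_sort` and the `chunk_size` parameter are hypothetical, and the FPGA stage is simulated with Python's built-in `sorted`.

```python
import heapq
from typing import List


def sort_chunks(data: List[int], chunk_size: int) -> List[List[int]]:
    """Stand-in for the FPGA stage (hypothetical): sort each fixed-size
    chunk independently, as parallel on-chip merge sorters might."""
    return [sorted(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def merge_on_cpu(runs: List[List[int]]) -> List[int]:
    """CPU stage: k-way merge of the already-sorted runs using a heap."""
    return list(heapq.merge(*runs))


def hybrid_sort(data: List[int], chunk_size: int) -> List[int]:
    """Sort chunks (simulated accelerator), then merge the runs on the CPU."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return merge_on_cpu(sort_chunks(data, chunk_size))
```

In this sketch, `chunk_size` plays the role of the largest data set a single accelerator instance can sort with its on-chip memory; the CPU merge is what lets the total input exceed that limit.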
Keywords
FPGA, Merge sort, Heterogeneous architecture