Understanding Performance Differences of FPGAs and GPUs

Jason Cong, Zhenman Fang, Michael Lo, Hanrui Wang, Jingxian Xu, Shaochong Zhang

2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2018

Abstract

This paper aims to better understand the performance differences between FPGAs and GPUs. We intentionally begin with a widely used GPU-friendly benchmark suite, Rodinia, and port 15 of the kernels onto FPGAs using HLS C. Then we propose an analytical model to compare their performance. We find that for 6 out of the 15 ported kernels, today's FPGAs can provide comparable performance or even achieve better performance than the GPU, while consuming an average of 28% of the GPU power. Besides lower clock frequency, FPGAs usually achieve a higher number of operations per cycle in each customized deep pipeline, but lower effective parallel factor due to the far lower off-chip memory bandwidth. With 4x more memory bandwidth, 8 out of the 15 FPGA kernels are projected to achieve at least half of the GPU kernel performance.
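The abstract attributes the performance gap to three factors: clock frequency, operations per cycle in each pipeline, and the effective parallel factor, which is limited by off-chip memory bandwidth. The sketch below shows one way these factors might combine. It is only an illustration under stated assumptions, not the paper's analytical model: the multiplicative form, the bandwidth cap, the function names, and all numbers are hypothetical.

```python
def throughput(freq_hz, ops_per_cycle_per_pipeline, effective_parallel_factor):
    """Operations per second, assuming the three factors simply multiply."""
    return freq_hz * ops_per_cycle_per_pipeline * effective_parallel_factor


def bandwidth_limited_parallel_factor(max_parallel, bytes_per_cycle_per_pipeline,
                                      freq_hz, mem_bw_bytes_per_s):
    """Number of pipelines that off-chip memory can keep fed (assumed cap).

    Each pipeline is assumed to consume a fixed number of bytes per cycle,
    so bandwidth supports mem_bw / (bytes_per_cycle * freq) pipelines.
    """
    sustainable = mem_bw_bytes_per_s / (bytes_per_cycle_per_pipeline * freq_hz)
    return min(max_parallel, sustainable)


# Hypothetical device: 200 MHz clock, room for 16 pipelines on chip,
# each pipeline doing 10 ops and reading 16 bytes per cycle.
FREQ = 200e6
base_pf = bandwidth_limited_parallel_factor(16, 16, FREQ, 12.8e9)
more_bw_pf = bandwidth_limited_parallel_factor(16, 16, FREQ, 4 * 12.8e9)

base_perf = throughput(FREQ, 10, base_pf)
more_bw_perf = throughput(FREQ, 10, more_bw_pf)
```

With these made-up numbers the design is bandwidth-bound, so quadrupling memory bandwidth raises the effective parallel factor, and the modeled throughput, by 4x. This is the kind of projection the abstract describes, though the paper's own figures come from its measured kernels.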

Keywords

FPGA, GPU, Analytical model, Performance comparison
