Throughput-oriented kernel porting onto FPGAs

DAC (2013)

Abstract
Reconfigurable devices are often employed in heterogeneous systems due to their low power and parallel processing advantages. An important usability requirement is the support of a homogeneous programming interface. Nevertheless, homogeneous programming interfaces do not eliminate the need for code tweaking to enable efficient mapping of the computation across heterogeneous architectures. In this work we propose a code optimization framework which analyzes and restructures CUDA kernels that are optimized for GPU devices in order to facilitate synthesis of high-throughput custom accelerators on FPGAs. The proposed framework enables efficient performance porting without manual code tweaking or annotation by the user. A hierarchical region graph in tandem with code motions and graph coloring of array variables is employed to restructure the kernel for high throughput execution on FPGAs.
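The abstract names graph coloring of array variables without giving details. As a rough illustration only, the sketch below shows one generic way such a coloring could work: arrays whose live ranges overlap are treated as interfering, and a greedy pass assigns each array a buffer index so that interfering arrays never share a buffer. The function name `color_arrays`, the live-range representation, and the sample array names are all hypothetical. This is a textbook greedy coloring, not the algorithm from the paper.

```python
def color_arrays(live_ranges):
    """Greedy coloring sketch (hypothetical helper, not the paper's method).

    live_ranges maps an array name to an inclusive (first_use, last_use)
    pair. Returns a dict mapping each array name to a buffer index such
    that arrays with overlapping live ranges get different indices.
    """
    # Visit arrays in order of first use so the result is deterministic.
    names = sorted(live_ranges, key=lambda n: live_ranges[n][0])

    # Build the interference graph: an edge joins two arrays whose
    # live ranges overlap.
    interferes = {n: set() for n in names}
    for i, a in enumerate(names):
        a_start, a_end = live_ranges[a]
        for b in names[i + 1:]:
            b_start, b_end = live_ranges[b]
            if a_start <= b_end and b_start <= a_end:
                interferes[a].add(b)
                interferes[b].add(a)

    # Give each array the smallest buffer index not used by a neighbor
    # that has already been colored.
    colors = {}
    for n in names:
        used = {colors[m] for m in interferes[n] if m in colors}
        c = 0
        while c in used:
            c += 1
        colors[n] = c
    return colors


# Assumed example: four kernel arrays with made-up live ranges.
ranges = {"tile_a": (0, 3), "tile_b": (2, 5), "partial": (4, 7), "out": (6, 9)}
buffers = color_arrays(ranges)
```

With these assumed ranges, the four arrays fit in two buffers, because `tile_a` and `partial` are never live at the same time, and neither are `tile_b` and `out`. A greedy pass like this does not guarantee a minimum number of buffers in general.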
Keywords
homogeneous programming interface, heterogeneous architecture, code optimization framework, efficient performance, efficient mapping, heterogeneous system, throughput-oriented kernel, code tweaking, manual code tweaking, hierarchical region graph, code motion, FPGA, graph theory, field programmable gate arrays, placement, physical design