Exact and Heuristic Allocation of Multi-kernel Applications to Multi-FPGA Platforms

Proceedings of the 56th Annual Design Automation Conference 2019 (2019)

Abstract
FPGA-based accelerators have demonstrated high energy efficiency compared to GPUs and CPUs. However, single-FPGA designs may not achieve sufficient task parallelism. In this work, we optimize the mapping of high-performance multi-kernel applications, such as Convolutional Neural Networks, to multi-FPGA platforms. First, we formulate the system-level optimization problem, choosing from a huge design space the parallelism and the number of compute units for each kernel in the pipeline. Then we solve it using a combination of Geometric Programming, which produces the optimum-performance solution under resource and DRAM bandwidth constraints, and a heuristic allocator that places the compute units on the FPGA cluster.
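
To make the allocation step concrete, the following is a minimal sketch of one possible heuristic allocator. It uses first-fit decreasing bin packing, a generic heuristic chosen here for illustration; the abstract does not say which heuristic the authors use. The function name `first_fit_decreasing`, the compute-unit names, and the single scalar resource fraction per compute unit are all assumptions. DRAM bandwidth and multiple resource types, which the abstract lists as constraints, are not modeled.

```python
from typing import Dict


def first_fit_decreasing(cu_demand: Dict[str, float],
                         num_fpgas: int,
                         capacity: float = 1.0) -> Dict[str, int]:
    """Assign each compute unit (CU) to an FPGA index.

    cu_demand maps a hypothetical CU name to the fraction of one FPGA's
    resources it needs (assumed to be a single scalar for simplicity).
    """
    free = [capacity] * num_fpgas          # remaining capacity per FPGA
    placement: Dict[str, int] = {}

    # Place the largest CUs first; ties keep their input order.
    ordered = sorted(cu_demand.items(), key=lambda kv: kv[1], reverse=True)
    for name, need in ordered:
        for fpga in range(num_fpgas):
            # Small tolerance guards against floating-point round-off.
            if need <= free[fpga] + 1e-9:
                free[fpga] -= need
                placement[name] = fpga
                break
        else:
            raise ValueError(f"no FPGA has room for {name}")
    return placement


# Hypothetical demands, e.g. as produced by an upstream optimization step.
demand = {
    "conv1_cu0": 0.5,
    "conv1_cu1": 0.5,
    "conv2_cu0": 0.4,
    "pool_cu0": 0.3,
    "fc_cu0": 0.2,
}
placement = first_fit_decreasing(demand, num_fpgas=2)
```

In this sketch the two `conv1` compute units fill the first FPGA and the remaining three share the second. A heuristic of this kind is fast but gives no optimality guarantee, which is why the abstract pairs a heuristic allocator with an exact Geometric Programming step.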
Keywords
task parallelism, geometric programming, DRAM bandwidth constraints, exact allocation, huge design space, system-level optimization problem, Convolutional Neural Networks, high-performance multi-kernel applications, single-FPGA designs, high energy efficiency, FPGA-based accelerators, multi-FPGA platforms, heuristic allocation, FPGA cluster, heuristic allocator, compute units