HLScope+: Fast and accurate performance estimation for FPGA HLS.

ICCAD (2017)

Abstract
High-level synthesis (HLS) tools have vastly increased the productivity of field-programmable gate array (FPGA) programmers through design automation and abstraction. A side effect, however, is that many architectural details are hidden from programmers. As a result, programmers who wish to improve the performance of their designs often have difficulty identifying the performance bottleneck. Current HLS tools do provide a performance estimate when loop counts are fixed, but they often fail to do so for programs with input-dependent execution behavior. In addition, their external memory latency model does not accurately fit the actual bus-based shared memory architecture. This work describes a high-level cycle estimation methodology that addresses these problems. To reduce the time overhead, we propose a cycle estimation process that is combined with HLS software simulation. We also present an automatic code instrumentation technique that accurately identifies the cause of stalls during on-board execution. Experimental results show that our framework provides cycle estimates with average error rates of 1.1% and 5.0% for compute-bound and DRAM-bound modules, respectively, on the ADM-PCIE-7V3 board. The proposed method is about two orders of magnitude faster than FPGA bitstream generation.
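To make the idea of combining cycle estimation with software simulation concrete, the sketch below counts the iterations of an input-dependent loop while running it in software, then feeds that count into the commonly used pipelined-loop latency model, cycles = depth + II * (trip_count - 1). This is an illustration only and not the HLScope+ implementation: the function names, the early-exit kernel, and the default initiation interval (II) and pipeline depth are all assumptions, and external memory stalls are ignored.

```python
def pipelined_loop_cycles(trip_count, ii=1, depth=1):
    """Textbook latency model for a pipelined loop (assumed, not from the paper).

    ii    -- initiation interval: cycles between the starts of two iterations
    depth -- pipeline depth: cycles for one iteration to complete
    """
    if trip_count <= 0:
        return 0
    return depth + ii * (trip_count - 1)


def simulate_and_estimate(data, threshold, ii=1, depth=4):
    """Run a hypothetical kernel in software and estimate its cycle count.

    The loop exits early on the first value above `threshold`, so its trip
    count depends on the input and cannot be known statically.
    """
    trip_count = 0
    total = 0
    for value in data:
        if value > threshold:  # input-dependent exit condition
            break
        total += value
        trip_count += 1  # count iterations actually executed
    return total, pipelined_loop_cycles(trip_count, ii, depth)


if __name__ == "__main__":
    result, cycles = simulate_and_estimate([1, 2, 3, 100, 5], threshold=10)
    print(result, cycles)
```

Because the trip count is observed during an ordinary software run, the estimate is available without generating a bitstream; a fuller model would also need the II and depth reported by the HLS tool and a memory latency term.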
Keywords
cycle estimation process, HLS software simulation, automatic code instrumentation technique, FPGA bitstream generation, high-level synthesis, field-programmable gate array programmers, external memory latency model, memory architecture, high-level cycle estimation methodology, HLScope+