HMLib: Efficient Data Transfer for HLS Using Host Memory.

FPGA(2023)

引用 0|浏览21
暂无评分
摘要
Streaming applications compose an important portion of the workloads that FPGAs may accelerate but suffer from inefficient data movement. The inefficiency stems from copying data indirectly into the FPGA DRAM rather than directly into its on-chip memory, substantially diminishing the end-to-end speedup, especially for small workloads (hundreds of kilobytes). AMD Xilinx's Host Memory IP (HMI) aims to address the data movement problem by exposing to the developer an High-Level Synthesis (HLS) interface that moves the data from the host directly to the FPGA's on-chip memory. However, using HMI purely for its interface without additional code changes incurred a 3.3x slowdown in comparison with the current programming model. The slowdown mainly originates from OpenCL call overhead and the kernel control logic unnecessarily switching states. To overcome these issues, we propose Host Memory Library (HMLib), an efficient HLS-based library that facilitates data transfer on behalf of the user. HMLib not only optimizes the runtime stack for efficient data transfer, but also provides HLS compatible and user-friendly interfaces. We demonstrate HMLib's effectiveness for streaming applications (Deflate compression and CRC32) with improvements of up to up to 36.2X over OpenCL-DDR and up to 79.5X over raw HMI for small-scale data while maintaining little-to-no performance loss for large scale inputs. We plan to open source our work in the future.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要