Fleet: A Framework for Massively Parallel Streaming on FPGAs

ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, March 2020

Abstract
We present Fleet, a framework that offers a massively parallel streaming model for FPGAs and is effective in a number of domains well-suited for FPGA acceleration, including parsing, compression, and machine learning. Fleet requires the user to specify RTL for a processing unit that serially processes every input token in a stream, a far simpler task than writing a parallel processing unit. It then takes the user's processing unit and generates a hardware design with many copies of the unit as well as memory controllers to feed the units with separate streams and drain their outputs. Fleet includes a Chisel-based processing unit language. The language maintains Chisel's low-level performance control while adding a few productivity features, including automatic handling of ready-valid signaling and a native and automatically pipelined BRAM type. We evaluate Fleet on six different applications, including JSON parsing and integer compression, fitting hundreds of Fleet processing units on the Amazon F1 FPGA and outperforming CPU implementations by over 400x and GPU implementations by over 9x in performance per watt while requiring a similar number of lines of code.
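To make the programming model concrete, the following is a minimal plain-Chisel sketch of the kind of serial, one-token-per-cycle processing unit Fleet expects the user to write; it does not use Fleet's actual processing-unit language (which handles ready-valid signaling automatically), and the module name, 8-bit token width, and `last` signal are illustrative assumptions rather than the paper's API.

```scala
import chisel3._
import chisel3.util._

// Illustrative serial streaming unit: counts newline tokens in its input stream
// and emits the count once the stream ends. Fleet would replicate many copies of
// a unit like this and generate memory controllers to feed and drain them.
class NewlineCounter extends Module {
  val io = IO(new Bundle {
    val in   = Flipped(Decoupled(UInt(8.W))) // one input token per handshake
    val last = Input(Bool())                 // asserted with the final token (assumed convention)
    val out  = Decoupled(UInt(32.W))         // single result token after the stream ends
  })

  val NEWLINE = 0x0A.U(8.W)
  val count   = RegInit(0.U(32.W))
  val done    = RegInit(false.B)

  // Accept tokens serially while the stream is still running.
  io.in.ready := !done
  when (io.in.valid && io.in.ready) {
    when (io.in.bits === NEWLINE) { count := count + 1.U }
    when (io.last)                { done  := true.B }
  }

  // Drain the result, then reset for the next stream.
  io.out.valid := done
  io.out.bits  := count
  when (io.out.valid && io.out.ready) {
    done  := false.B
    count := 0.U
  }
}
```

In Fleet's model, the user only writes this serial per-token logic; parallelism comes from the framework instantiating hundreds of such units, each fed with an independent input stream.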
Keywords
FPGAs, HDLs, streaming, identical processors