Application Partitioning on FPGA Clusters: Inference over Decision Tree Ensembles

2018 28th International Conference on Field Programmable Logic and Applications (FPL)

Abstract
Just as multi-core processors and CPU clusters are used for large problems, multi-FPGA clusters are needed to tackle applications that do not fit within a single FPGA, such as machine learning methods based on large models. Recent FPGA deployments in datacenters offer flexible pools of FPGAs that can be used in different configurations. In addition to the capabilities of a typical cluster architecture, FPGA clusters often have added features, such as being hosted by large server nodes (e.g., Amazon F1 Instance) or having a network backbone that directly connects multiple FPGAs (e.g., Microsoft Catapult). While such designs open up many opportunities, mapping application logic onto a pool of FPGA resources is a nontrivial task. It requires partitioning the application across multiple FPGAs, managing inter-FPGA communication for multiple classes of data streams, and balancing communication and computation bandwidth. In this paper, we explore and develop techniques for mapping a resource-intensive machine learning application, namely inference over decision tree ensembles, onto a datacenter-grade FPGA cluster. The FPGA cluster is built out of 20 Microsoft Catapult FPGA boards with a flexible inter-FPGA network topology. We developed a lightweight inter-FPGA communication protocol and routing layer to facilitate communication between the different parts of the application. Our evaluation provides insights into the overall performance benefits of the design and outlines some of the techniques needed to efficiently map applications onto a pool of distributed FPGAs.
Keywords
FPGA cluster, Multi-FPGA design, FPGA programming