Exploiting High-Bandwidth Memory for FPGA-Acceleration of Inference on Sum-Product Networks

2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2022

Abstract
Due to the memory wall becoming increasingly problematic in high-performance computing, there is a steady push to improve memory architectures, focusing mainly on higher bandwidth and lower latency. One result of this push is High-Bandwidth Memory (HBM), an alternative to the regular DRAM typically found on accelerator cards. This work adapts an existing accelerator architecture for inference on Sum-Product Networks (SPNs) to exploit the HBM present on more recent high-performance FPGA accelerator cards. The evaluation shows that the use of HBM enables almost linear performance scaling due to the embarrassingly parallel nature of batch-wise SPN inference. It is also shown that the only hindrance to this scaling is the limited bandwidth available for data transfers between host and FPGA. Even with this bottleneck, the prior FPGA-based implementation is outperformed by up to 1.50x (geo.-mean 1.29x). Similarly, the CPU and GPU baselines are outperformed by up to 2.4x (geo.-mean 1.6x) and 8.4x (geo.-mean 6.9x), respectively. Based on the evaluation, the scaling potential of HBM-based FPGA accelerators is explored to give an outlook on what future generations of PCIe-based interfaces may enable.
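To illustrate why batch-wise SPN inference is embarrassingly parallel, the following minimal sketch evaluates a toy two-variable SPN over a batch of samples. The SPN structure, weights, and leaf parameters are illustrative assumptions, not the architecture from the paper; the point is only that each sample's log-likelihood is computed independently, so the batch can be split across HBM channels or compute units with no cross-sample communication.

```python
import numpy as np

def gaussian_log_pdf(x, mean, std):
    """Log-density of a univariate Gaussian leaf node."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2 * np.pi))

def spn_log_likelihood(batch):
    """Bottom-up, log-space evaluation of a toy 2-variable SPN.

    Toy structure (hypothetical): a root sum node over two product
    nodes, each combining one Gaussian leaf per input variable.
    """
    x0, x1 = batch[:, 0], batch[:, 1]
    # Product nodes: products of leaf densities become sums of log-densities.
    p0 = gaussian_log_pdf(x0, 0.0, 1.0) + gaussian_log_pdf(x1, 0.0, 1.0)
    p1 = gaussian_log_pdf(x0, 3.0, 0.5) + gaussian_log_pdf(x1, -2.0, 0.8)
    # Sum node: weighted mixture, computed as a numerically stable log-sum-exp.
    return np.logaddexp(np.log(0.6) + p0, np.log(0.4) + p1)

# Splitting the batch mimics distributing work over independent memory
# channels: each chunk is processed without touching the others.
batch = np.random.randn(1024, 2)
chunks = np.array_split(batch, 4)  # e.g. four HBM pseudo-channels
results = np.concatenate([spn_log_likelihood(c) for c in chunks])
assert np.allclose(results, spn_log_likelihood(batch))
```

Because chunk results only need to be concatenated at the end, per-channel throughput adds up almost linearly until the host-to-FPGA link becomes the bottleneck, matching the scaling behavior reported in the abstract.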
Keywords
Sum-Product Network, Probabilistic Models, Machine Learning, High-Bandwidth Memory, FPGA