Data Transfer Modeling and Optimization in Reconfigurable Multi-Accelerator Systems

2019 14th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2019

Abstract
The use of accelerator-centric processing architectures in different application scenarios, ranging from the cloud to the edge, is nowadays a reality. However, increasingly stringent operating conditions and requirements continue to push research on hardware-based processing architectures that provide medium to high computing performance while supporting energy-efficient execution. In addition, reconfigurable devices (i.e., FPGAs) provide another degree of freedom by enabling software-like flexibility through time-multiplexing of the computing resources. Nevertheless, bus-based computing platforms still face architectural bottlenecks when data transfers are not handled efficiently. In this paper, the communication overhead in a reconfigurable multi-accelerator architecture for high-performance embedded computing is analyzed and modeled. The resulting models are then used to predict acceleration performance and to evaluate two different data transfer patterns: on the one hand, a basic approach in which data preparation and DMA transfers are executed sequentially; on the other hand, a pipelined approach in which data preparation and DMA transfers are executed in parallel. The evaluation method is based on well-known accelerator benchmarks from the MachSuite suite. Experimental results show that the pipelined data management approach increases performance by up to 2.6x compared to the sequential alternative, and by up to 26.46x compared to a bare-metal execution of the accelerators (i.e., without using the reconfigurable multi-accelerator processing architecture or an operating system).
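To make the two transfer patterns concrete, the following C sketch contrasts a sequential approach with a pipelined (double-buffered) one, under the assumption of a simple block-based interface. The hooks prepare_block, dma_start, and dma_wait, as well as the block size, are hypothetical stand-ins for illustration only and are not the paper's actual API.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_WORDS 1024  /* assumed block size, illustrative only */

/* Hypothetical platform hooks: prepare_block() stages one chunk of input
 * data in a DMA-visible buffer, dma_start() kicks off a transfer to the
 * accelerator, and dma_wait() blocks until the transfer completes.       */
extern void prepare_block(uint32_t *dst, size_t block_idx);
extern void dma_start(const uint32_t *src, size_t words);
extern void dma_wait(void);

/* Sequential pattern: data preparation and the DMA transfer of each block
 * run one after the other, so the DMA engine sits idle while the CPU
 * prepares the next block.                                               */
static void transfer_sequential(size_t n_blocks)
{
    static uint32_t buf[BLOCK_WORDS];

    for (size_t i = 0; i < n_blocks; i++) {
        prepare_block(buf, i);        /* CPU-side data preparation   */
        dma_start(buf, BLOCK_WORDS);  /* DMA transfer to accelerator */
        dma_wait();
    }
}

/* Pipelined pattern: two buffers are used so the CPU prepares block i+1
 * while the DMA engine is still moving block i, overlapping data
 * preparation with the transfer.                                         */
static void transfer_pipelined(size_t n_blocks)
{
    static uint32_t buf[2][BLOCK_WORDS];

    if (n_blocks == 0)
        return;

    prepare_block(buf[0], 0);         /* prime the pipeline */

    for (size_t i = 0; i < n_blocks; i++) {
        dma_start(buf[i & 1], BLOCK_WORDS);
        if (i + 1 < n_blocks)
            prepare_block(buf[(i + 1) & 1], i + 1); /* overlap with DMA */
        dma_wait();
    }
}
```

In the pipelined variant the per-block cost approaches the larger of the preparation time and the transfer time instead of their sum, which is the effect the paper's models quantify and the source of the reported speedup over the sequential alternative.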
Keywords
FPGAs,Communication Modeling,Dynamic and Partial Reconfiguration,Hardware Architectures