Data Movement Accelerator Engines on a Prototype Power10 Processor

IEEE Micro (2022)

Abstract
This article presents the design and implementation of active messaging engines (AMEs) on an IBM Power10 prototype chip. AMEs are tiny, simple, yet fully programmable 64-bit processors for offloading operations related to data movement. They can offload the execution flow of the message passing interface and other messaging stacks from the host central processing unit, enabling truly asynchronous progress that overlaps computation and communication. The AMEs are implemented as onboard OpenCAPI-compliant accelerators, leveraging existing OpenCAPI infrastructure. As realized in a 7-nm technology, each AME occupies 0.034 mm² of silicon area and consumes 4.1 mW of power. AME performance is evaluated across several contiguous and noncontiguous memory copy scenarios. AMEs can reach the bandwidth limit of their access path to main memory (32 GB/s) and incur a per-request overhead of about 600 ns. These results indicate that AMEs will confer advantages to general messaging libraries for processing, sending, and receiving on-node and off-node messages.
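To relate the two performance figures quoted above, the following is a rough sketch that assumes a simple linear cost model (fixed per-request overhead plus transfer time at peak bandwidth). The model and the function names are illustrative assumptions, not taken from the article; only the 32 GB/s and 600 ns values come from the abstract.

```python
# Illustrative linear cost model (an assumption, not the article's method):
#   time = fixed per-request overhead + bytes / peak bandwidth

PEAK_BANDWIDTH_BYTES_PER_S = 32e9  # 32 GB/s, quoted in the abstract
PER_REQUEST_OVERHEAD_S = 600e-9    # about 600 ns, quoted in the abstract


def copy_time_s(num_bytes: float) -> float:
    """Estimated time for one copy request under the assumed linear model."""
    return PER_REQUEST_OVERHEAD_S + num_bytes / PEAK_BANDWIDTH_BYTES_PER_S


def effective_bandwidth_bytes_per_s(num_bytes: float) -> float:
    """Bytes moved per second once the fixed overhead is included."""
    return num_bytes / copy_time_s(num_bytes)


def half_bandwidth_size_bytes() -> float:
    """Request size at which the overhead equals the transfer time,
    so the effective bandwidth is half of the peak."""
    return PER_REQUEST_OVERHEAD_S * PEAK_BANDWIDTH_BYTES_PER_S


if __name__ == "__main__":
    for size in (256, 4096, 65536, 1 << 20):
        fraction = effective_bandwidth_bytes_per_s(size) / PEAK_BANDWIDTH_BYTES_PER_S
        print(f"{size:>8} B: {fraction:.1%} of peak")
    print(f"half-bandwidth size: {half_bandwidth_size_bytes():.0f} B")
```

Under this assumed model, a request of about 19.2 KB spends as long on overhead as on moving data, so requests well below that size would be dominated by the fixed overhead and much larger ones would approach the 32 GB/s limit.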
Keywords
Accelerators, Prototypes, Engines, System-on-chip, Memory management, Hardware