StreamPIM: Streaming Matrix Computation in Racetrack Memory.

Yuda An, Yunxiao Tang, Shushu Yi, Li Peng, Xiurui Pan, Guangyu Sun, Zhaochu Luo, Qiao Li, Jie Zhang

International Symposium on High-Performance Computer Architecture (2024)

Abstract
Racetrack memory (RM) techniques have become promising solutions to the memory wall issue because they increase memory density, reduce energy consumption, and can be used to build processing-in-memory (PIM) architectures. RM can place arithmetic logic units in or near its memory arrays to process tasks offloaded by the host. While multiple studies of processing in RM already exist, these solutions unfortunately suffer from data transfer overheads imposed by the loose coupling of the memory core and the computation units. To address this issue, we propose StreamPIM, a new processing-in-RM architecture that tightly couples the memory core and the computation units. Specifically, StreamPIM constructs a matrix processor directly from domain-wall nanowires, without using CMOS-based computation units. It also designs a domain-wall nanowire-based bus, which can eliminate electromagnetic conversion. StreamPIM further optimizes performance by leveraging RM internal parallelism. Our evaluation results show that StreamPIM achieves 39.1× higher performance and reduces energy consumption by 58.4× compared with the traditional computing platform.