FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators

Pratyush Dhingra, Chukwufumnanya Ogbogu, Biresh Kumar Joardar, Janardhan Rao Doppa, Ananth Kalyanaraman, Partha Pratim Pande

CoRR (2024)

Abstract
Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) architectures are an attractive solution for training Graph Neural Networks (GNNs) on edge platforms. However, the immature fabrication process and limited write endurance of ReRAMs make them prone to hardware faults, which limits their widespread adoption for GNN training. Further, existing fault-tolerant solutions prove inadequate for effectively training GNNs in the presence of faults. In this paper, we propose a fault-aware framework, referred to as FARe, that mitigates the effect of faults during GNN training. FARe outperforms existing approaches in terms of both accuracy and timing overhead. Experimental results demonstrate that the FARe framework can restore GNN test accuracy by 47.6% on faulty ReRAM hardware with a ~1% timing overhead compared to the fault-free counterpart.
Keywords
ReRAM, PIM, Fault-Tolerant Training, GNNs