GRIP: A Graph Neural Network Accelerator Architecture

arXiv (2023)

Abstract
We present GRIP, a graph neural network accelerator architecture designed for low-latency inference. Accelerating GNNs is challenging because they combine two distinct types of computation: arithmetic-intensive vertex-centric operations and memory-intensive edge-centric operations. GRIP splits GNN inference into three edge- and vertex-centric execution phases that can each be implemented in hardware. GRIP specializes each unit for the unique computational structure found in each phase. For vertex-centric phases, GRIP uses a high-performance matrix-multiply engine coupled with a dedicated memory subsystem for weights to improve reuse. For edge-centric phases, GRIP uses multiple parallel prefetch and reduction engines to alleviate the irregularity of memory accesses. Finally, GRIP supports several GNN optimizations, including an optimization called vertex-tiling that increases the reuse of weight data. We evaluate GRIP by performing synthesis and place-and-route for a 28 nm implementation capable of executing inference for several widely used GNN models (GCN, GraphSAGE, G-GCN, and GIN). Across several benchmark graphs, it reduces 99th-percentile latency by a geometric mean of 17x and 23x compared to a CPU and GPU baseline, respectively, while drawing only 5 W.
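To make the edge-/vertex-centric split concrete, the following is a minimal NumPy sketch of one GCN-style layer factored into the two phase types the abstract describes. It is a software analogue only, not GRIP's hardware interface; the function names (`gather_reduce`, `vertex_transform`) and the tiling scheme are illustrative assumptions, and the tiling shown (keeping one weight tile resident while sweeping all vertices) is just one way to increase weight reuse in the spirit of vertex-tiling.

```python
import numpy as np

def gather_reduce(edges, feats):
    """Edge-centric phase: irregular and memory-bound.

    For each destination vertex, fetch its neighbors' feature rows and
    reduce them (sum here). GRIP spreads this work across parallel
    prefetch and reduction engines to tolerate irregular accesses.
    """
    agg = np.zeros_like(feats)
    for src, dst in edges:          # edge list of (source, destination) pairs
        agg[dst] += feats[src]      # gather one row, accumulate into dst
    return agg

def vertex_transform(agg, weight, tile=16):
    """Vertex-centric phase: dense and arithmetic-bound.

    A dense matrix multiply agg @ weight, tiled over the output feature
    dimension: each weight tile is loaded once and reused across every
    vertex before moving to the next tile, improving weight reuse.
    """
    n, f_out = agg.shape[0], weight.shape[1]
    out = np.zeros((n, f_out))
    for lo in range(0, f_out, tile):
        hi = min(lo + tile, f_out)
        w_tile = weight[:, lo:hi]   # one weight tile, reused for all vertices
        out[:, lo:hi] = agg @ w_tile
    return out

# Toy usage: 4 vertices, 3 directed edges, 8-dim features, ReLU activation.
feats = np.random.rand(4, 8)
weights = np.random.rand(8, 8)
edges = [(0, 1), (2, 1), (3, 0)]
h = np.maximum(vertex_transform(gather_reduce(edges, feats), weights), 0.0)
```

The split makes the contrast in the abstract visible: `gather_reduce` is dominated by data-dependent row fetches, while `vertex_transform` is a regular dense kernel whose reuse can be controlled by tiling.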
Keywords
Aggregates, Memory management, Bandwidth, Sparse matrices, Programming, Graph neural networks, Computational modeling, Accelerator architectures, neural networks, hardware, system-on-chip, graph neural networks