MEGA: A Memory-Efficient GNN Accelerator Exploiting Degree-Aware Mixed-Precision Quantization
2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (2023)
Abstract
Graph Neural Networks (GNNs) are becoming a promising technique in various
domains due to their excellent capabilities in modeling non-Euclidean data.
Although a spectrum of accelerators has been proposed to accelerate the
inference of GNNs, our analysis demonstrates that the latency and energy
consumption induced by DRAM access still significantly impede the improvement
of performance and energy efficiency. To address this issue, we propose a
Memory-Efficient GNN Accelerator (MEGA) through algorithm and hardware
co-design in this work. Specifically, at the algorithm level, through an
in-depth analysis of the node property, we observe that the data-independent
quantization in previous works is not optimal in terms of accuracy and memory
efficiency. This motivates us to propose the Degree-Aware mixed-precision
quantization method, in which a proper bitwidth is learned and allocated to a
node according to its in-degree to compress GNNs as much as possible while
maintaining accuracy. At the hardware level, we employ a heterogeneous
architecture design in which the aggregation and combination phases are
implemented separately with different dataflows. In order to boost the
performance and energy efficiency, we also present an Adaptive-Package format
to alleviate the storage overhead caused by the fine-grained bitwidth and
diverse sparsity, and a Condense-Edge scheduling method to enhance the data
locality and further alleviate the access irregularity induced by the extremely
sparse adjacency matrix in the graph. We implement our MEGA accelerator in a
28nm technology node. Extensive experiments demonstrate that MEGA achieves
average speedups of 38.3x, 7.1x, 4.0x, and 3.6x and energy savings of 47.6x,
7.2x, 5.4x, and 4.5x over four state-of-the-art GNN accelerators, HyGCN, GCNAX,
GROW, and SGCN, respectively, while retaining task accuracy.
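
The degree-aware idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: MEGA learns the bitwidths during training, whereas this sketch substitutes a fixed, hypothetical threshold table and assumes that nodes with higher in-degree receive more bits. The function names (`in_degrees`, `assign_bitwidth`, `quantize`, `feature_storage_bits`) and the threshold values are illustrative assumptions, and the quantizer is plain symmetric uniform quantization.

```python
from collections import Counter


def in_degrees(num_nodes, edges):
    """Count incoming edges per node; `edges` holds (src, dst) pairs."""
    counts = Counter(dst for _, dst in edges)
    return [counts.get(v, 0) for v in range(num_nodes)]


def assign_bitwidth(degree, thresholds=((3, 8), (1, 4)), default_bits=2):
    """Map an in-degree to a bitwidth.

    Assumption: a fixed threshold table stands in for the learned
    allocation described in the abstract, and higher in-degree is
    assumed to warrant more bits. `thresholds` lists
    (min_degree, bits) pairs in descending order of min_degree.
    """
    for min_degree, bits in thresholds:
        if degree >= min_degree:
            return bits
    return default_bits


def quantize(features, bits):
    """Symmetric uniform quantization of one node's feature vector.

    Returns the signed integer codes and the scale needed to
    dequantize them (value ~= code * scale). Requires bits >= 2.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(x) for x in features)
    if max_abs == 0:
        return [0] * len(features), 1.0
    scale = max_abs / qmax
    codes = [max(-qmax, min(qmax, round(x / scale))) for x in features]
    return codes, scale


def feature_storage_bits(bitwidths, feature_dim):
    """Total bits needed to store all node features (scales excluded)."""
    return sum(bits * feature_dim for bits in bitwidths)


# Toy graph: node 3 receives three edges, node 1 receives one.
edges = [(0, 3), (1, 3), (2, 3), (0, 1)]
degrees = in_degrees(4, edges)
bitwidths = [assign_bitwidth(d) for d in degrees]
mixed_bits = feature_storage_bits(bitwidths, feature_dim=16)
fp32_bits = feature_storage_bits([32] * 4, feature_dim=16)
```

In this toy graph most nodes have low in-degree and therefore fall into the narrowest bitwidth bucket, which is where the memory saving relative to a uniform 32-bit layout comes from. The per-node scale factors and any packing overhead are ignored here; the abstract's Adaptive-Package format addresses that storage layout in hardware and is not modeled.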
Keywords
GNN,Mixed-Precision Quantization,Accelerator