MSPipe: Efficient Temporal GNN Training via Staleness-aware Pipeline
CoRR(2024)
Abstract
Memory-based Temporal Graph Neural Networks (MTGNNs) are a class of temporal
graph neural networks that utilize a node memory module to capture and retain
long-term temporal dependencies, leading to superior performance compared to
memory-less counterparts. However, the iterative reading and updating process
of the memory module in MTGNNs to obtain up-to-date information needs to follow
the temporal dependencies. This introduces significant overhead and limits
training throughput. Existing optimizations for static GNNs are not directly
applicable to MTGNNs because the two differ in training paradigm and model
architecture, and static GNNs lack a memory module. Moreover, such
optimizations do not address the challenges posed by temporal dependencies,
making them ineffective for MTGNN training. In this paper, we propose MSPipe, a general and
efficient framework for MTGNNs that maximizes training throughput while
maintaining model accuracy. Our design addresses the unique challenges
associated with fetching and updating node memory states in MTGNNs by
integrating staleness into the memory module. However, simply introducing a
predefined staleness bound in the memory module to break temporal dependencies
may lead to suboptimal performance and lack of generalizability across
different models and datasets. To solve this, we introduce an online pipeline
scheduling algorithm in MSPipe that strategically breaks temporal dependencies
with minimal staleness and delays memory fetching to obtain fresher memory
states. Moreover, we design a staleness mitigation mechanism to enhance
training convergence and model accuracy. We provide convergence analysis and
prove that MSPipe maintains the same convergence rate as vanilla sample-based
GNN training. Experimental results show that MSPipe achieves up to 2.45x
speed-up without sacrificing accuracy, making it a promising solution for
efficient MTGNN training.
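The core idea of bounded staleness in the abstract can be illustrated with a minimal sketch (not the authors' implementation; the class, its methods, and the scalar memory states are hypothetical): memory updates produced by earlier batches are applied lazily, and a fetch for batch i only forces updates that would otherwise exceed the staleness bound, so reads lag by at most a fixed number of batches.

```python
# Hypothetical sketch of a bounded-staleness node-memory store. Batch i may
# read memory that lags by at most `staleness_bound` batches; pending updates
# older than that must be flushed before the read. This only illustrates the
# scheduling constraint described in the abstract, not MSPipe's actual code.
from collections import deque

class BoundedStalenessMemory:
    def __init__(self, num_nodes, staleness_bound):
        self.memory = [0.0] * num_nodes        # per-node memory state (toy scalars)
        self.staleness_bound = staleness_bound
        self.pending = deque()                 # (batch_id, {node: new_state})

    def enqueue_update(self, batch_id, updates):
        """Record memory updates produced by `batch_id`; applied lazily."""
        self.pending.append((batch_id, updates))

    def fetch(self, batch_id, nodes):
        """Before batch `batch_id` reads memory, flush every pending update
        older than the staleness bound, so reads are at most that stale."""
        while self.pending and batch_id - self.pending[0][0] > self.staleness_bound:
            _, updates = self.pending.popleft()
            for node, state in updates.items():
                self.memory[node] = state
        return {n: self.memory[n] for n in nodes}

mem = BoundedStalenessMemory(num_nodes=4, staleness_bound=2)
mem.enqueue_update(0, {1: 1.0})   # update produced by batch 0
mem.enqueue_update(1, {2: 2.0})   # update produced by batch 1
snap = mem.fetch(3, [1, 2])       # batch 3 forces only the batch-0 update
```

Here `fetch(3, ...)` applies the batch-0 update (3 - 0 exceeds the bound of 2) but leaves the batch-1 update pending, so node 2 is read stale; MSPipe's scheduler additionally chooses the minimal such bound online and compensates stale reads, which this sketch does not model.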