Online GNN Evaluation Under Test-time Graph Distribution Shifts
ICLR 2024
Abstract
Evaluating the performance of a well-trained GNN model on real-world graphs
is a pivotal step for reliable GNN online deployment and serving. Due to a lack
of test node labels and unknown potential training-test graph data distribution
shifts, conventional model evaluation encounters limitations in calculating
performance metrics (e.g., test error) and measuring graph data-level
discrepancies, particularly when the training graph used for developing GNNs
remains unobserved during test time. In this paper, we study a new research
problem, online GNN evaluation, which aims to provide valuable insights into
the well-trained GNNs' ability to effectively generalize to real-world
unlabeled graphs under the test-time graph distribution shifts. Concretely, we
develop an effective learning behavior discrepancy score, dubbed LeBeD, to
estimate the test-time generalization errors of well-trained GNN models.
Through a novel GNN re-training strategy with a parameter-free optimality
criterion, the proposed LeBeD comprehensively integrates learning behavior
discrepancies from both node prediction and structure reconstruction
perspectives. This enables the effective evaluation of the well-trained GNNs'
ability to capture test node semantics and structural representations, making
it an expressive metric for estimating the generalization error in online GNN
evaluation. Extensive experiments on real-world test graphs under diverse graph
distribution shifts verify the effectiveness of the proposed method,
revealing its strong correlation with ground-truth test errors on various
well-trained GNN models.
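The abstract describes LeBeD as combining discrepancies from node prediction and structure reconstruction between a well-trained GNN and a re-trained counterpart. The paper's actual method is not reproduced here; the following is a minimal, hypothetical NumPy sketch of such a two-term discrepancy score, with all function and variable names assumed for illustration.

```python
import numpy as np

def lebed_style_score(logits_trained, logits_retrained, adj, emb_retrained):
    """Illustrative (not the paper's) two-term discrepancy score.

    logits_trained   : (n, c) class logits from the well-trained GNN
    logits_retrained : (n, c) class logits from a re-trained GNN
    adj              : (n, n) observed test-graph adjacency matrix
    emb_retrained    : (n, d) node embeddings from the re-trained GNN
    """
    # Term 1 (node-prediction discrepancy): fraction of test nodes whose
    # predicted label differs between the two models.
    pred_disc = np.mean(
        logits_trained.argmax(axis=1) != logits_retrained.argmax(axis=1)
    )
    # Term 2 (structure-reconstruction discrepancy): mean squared error
    # between the observed adjacency and a simple sigmoid inner-product
    # reconstruction from the re-trained embeddings.
    adj_hat = 1.0 / (1.0 + np.exp(-emb_retrained @ emb_retrained.T))
    struct_disc = np.mean((adj - adj_hat) ** 2)
    # Combine both perspectives into a single score; a score correlating
    # with ground-truth test error is the goal described in the abstract.
    return pred_disc + struct_disc

# Toy usage with random stand-ins for model outputs on a 5-node test graph.
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(5, 3))
logits_b = rng.normal(size=(5, 3))
adjacency = np.eye(5)
embeddings = rng.normal(size=(5, 2))
score = lebed_style_score(logits_a, logits_b, adjacency, embeddings)
```

Both terms are non-negative, so the combined score is bounded below by zero; lower values indicate closer agreement between the two models on the unlabeled test graph.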
Keywords
Graph neural networks, Model evaluation, Distribution shift