Kernel Ridge Regression-Based Graph Dataset Distillation

KDD 2023

Abstract
The huge volume of emerging graph datasets has become a double-edged sword for graph machine learning. On the one hand, it empowers a myriad of graph neural networks (GNNs) with strong empirical performance. On the other hand, training modern GNNs on huge graph data is computationally expensive. How to distill a given graph dataset while retaining most of the trained models' performance is a challenging problem. Existing efforts approach this problem by solving meta-learning-based bilevel optimization objectives. A major hurdle is that the exact solutions of these objectives are computationally intensive; thus most, if not all, existing methods resort to approximate strategies, which in turn hurt distillation performance. In this paper, inspired by recent advances in neural network kernel methods, we adopt a kernel ridge regression-based meta-learning objective that admits a feasible exact solution. However, computing the graph neural tangent kernel (GNTK) is very expensive, especially in the context of dataset distillation. In response, we design a graph kernel tailored for the dataset distillation problem, named LiteGNTK, which is closely related to the classic random walk graph kernel. We propose an effective model, Kernel ridge regression-based graph Dataset Distillation (KiDD), together with its variants. KiDD is highly efficient in both the forward and backward propagation processes. At the same time, KiDD shows strong empirical performance on 7 real-world datasets compared with state-of-the-art distillation methods. Thanks to its ability to find the exact solution of the distillation objective, the training graphs learned by KiDD can sometimes even outperform the full original training set while using as few as 1.65% of the training graphs.
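To make the kernel ridge regression (KRR) idea concrete, here is a minimal, hedged sketch of a KRR-based distillation objective with a closed-form inner solution, in the spirit the abstract describes. It uses a generic RBF kernel over feature vectors as a stand-in; it is not the paper's LiteGNTK or KiDD algorithm, and all names (kernel_matrix, distillation_loss, ridge) are hypothetical illustrations.

```python
# Sketch of a KRR-based dataset distillation loss: the inner "training" on the
# synthetic set is solved exactly in closed form, and the outer loss measures
# how well that exact solution predicts the real data.
# NOTE: placeholder RBF kernel, not the paper's LiteGNTK graph kernel.
import numpy as np

def kernel_matrix(A, B, gamma=0.1):
    # Placeholder RBF kernel between rows of A (n x d) and B (m x d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def distillation_loss(X_syn, y_syn, X_real, y_real, ridge=1e-3):
    # Exact KRR fit on the synthetic set (no iterative inner loop needed).
    K_ss = kernel_matrix(X_syn, X_syn)
    K_rs = kernel_matrix(X_real, X_syn)
    alpha = np.linalg.solve(K_ss + ridge * np.eye(len(X_syn)), y_syn)
    preds = K_rs @ alpha  # predictions of the KRR model on the real set
    return 0.5 * np.mean((preds - y_real) ** 2)

# Usage: minimize distillation_loss w.r.t. (X_syn, y_syn) with any
# gradient-based optimizer; here we simply evaluate it once on random data.
rng = np.random.default_rng(0)
X_real, y_real = rng.normal(size=(100, 16)), rng.normal(size=(100, 1))
X_syn, y_syn = rng.normal(size=(10, 16)), rng.normal(size=(10, 1))
print(distillation_loss(X_syn, y_syn, X_real, y_real))
```

Because the inner problem is solved exactly, the bilevel objective collapses to a single differentiable loss in the synthetic data, which is the property the abstract credits for KiDD's strong results.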
Keywords
graph machine learning, graph dataset distillation