GraphGuard: Detecting and Counteracting Training Data Misuse in Graph Neural Networks
CoRR (2023)
Abstract
The emergence of Graph Neural Networks (GNNs) in graph data analysis and
their deployment on Machine Learning as a Service platforms have raised
critical concerns about data misuse during model training. This situation is
further exacerbated by the lack of transparency in local training
processes, potentially leading to the unauthorized accumulation of large
volumes of graph data, thereby infringing on the intellectual property rights
of data owners. Existing methodologies often address either data misuse
detection or mitigation, and are primarily designed for local GNN models rather
than cloud-based MLaaS platforms. These limitations call for an effective and
comprehensive solution that detects and mitigates data misuse without requiring
exact training data while respecting the proprietary nature of such data. This
paper introduces GraphGuard, a pioneering approach that tackles these
challenges. We propose a training-data-free method that not only detects graph
data misuse but also mitigates its impact via targeted unlearning, all without
relying on the original training data. Our innovative misuse detection
technique employs membership inference with radioactive data, enhancing the
distinguishability between member and non-member data distributions. For
mitigation, we utilize synthetic graphs that emulate the characteristics
previously learned by the target model, enabling effective unlearning even in
the absence of exact graph data. We conduct comprehensive experiments utilizing
four real-world graph datasets to demonstrate the efficacy of GraphGuard in
both detection and unlearning. We show that GraphGuard attains a detection
rate of approximately 100% across these datasets with various GNN
models. In addition, it performs unlearning by eliminating the impact of the
unlearned graph with a marginal decrease in accuracy (less than 5%).