The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
CoRR (2024)
Abstract
Large-scale black-box models have become ubiquitous across numerous
applications. Understanding the influence of individual training data sources
on predictions made by these models is crucial for improving their
trustworthiness. Current influence estimation techniques involve either
computing gradients for every training point or repeatedly retraining on
different subsets. These approaches face substantial computational challenges
when scaled up to large datasets and models.
In this paper, we introduce and explore the Mirrored Influence Hypothesis,
highlighting a reciprocal nature of influence between training and test data.
Specifically, it suggests that evaluating the influence of training data on
test predictions can be reformulated as an equivalent, yet inverse problem:
assessing how the predictions for training samples would be altered if the
model were trained on specific test samples. Through both empirical and
theoretical validations, we demonstrate the wide applicability of our
hypothesis. Inspired by this, we introduce a new method for estimating the
influence of training data, which requires calculating gradients for specific
test samples, paired with a forward pass for each training point. This approach
can capitalize on the common asymmetry in scenarios where the number of test
samples under concurrent examination is much smaller than the scale of the
training dataset, thus gaining a significant improvement in efficiency compared
to existing approaches.
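To make the reformulation concrete, the following is a minimal toy sketch of the mirrored idea (not the paper's exact Forward-INF algorithm): rather than computing a gradient per training point, we take one gradient step on the test sample and then use a single cheap forward pass per training point, scoring each by how much its loss drops after the test-driven update. The linear model, data, learning rate, and initial weight below are all illustrative assumptions.

```python
# Hypothetical illustration of the Mirrored Influence Hypothesis on a
# one-parameter linear model; all values here are made up for the sketch.

def predict(w, x):
    return w * x

def loss(w, x, y):
    # squared-error loss for a single point
    return 0.5 * (predict(w, x) - y) ** 2

def grad(w, x, y):
    # gradient of the loss with respect to w
    return (predict(w, x) - y) * x

train = [(1.0, 2.0), (2.0, 4.0), (3.0, -1.0)]  # toy data; last point is "mislabeled"
test = (1.5, 3.0)                              # the test sample under examination

w = 1.8     # weight of a "pretrained" model (assumed)
lr = 0.01   # step size for the mirrored update (assumed)

# One gradient computation, on the test sample only.
w_after = w - lr * grad(w, *test)

# One forward pass per training point: the drop in that point's loss after
# the test-driven update serves as its (mirrored) influence score.
scores = [loss(w, x, y) - loss(w_after, x, y) for x, y in train]
```

In this toy setup, training points consistent with the test sample see their loss decrease (positive score), while the mislabeled point's loss increases (negative score), mirroring how helpful and harmful training points would be ranked. The cost asymmetry is the point: gradients are computed only for the (few) test samples, while each training point costs only a forward pass.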
We demonstrate the applicability of our method across a range of scenarios,
including data attribution in diffusion models, data leakage detection,
analysis of memorization, mislabeled data detection, and tracing behavior in
language models. Our code will be made available at
https://github.com/ruoxi-jia-group/Forward-INF.