Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
CoRR (2024)
Abstract
We investigate the problem of explainability in machine learning. To address
this problem, Feature Attribution Methods (FAMs) measure the contribution of
each feature through a perturbation test, in which the change in prediction is
compared across different perturbations. However, such perturbation tests may
not accurately distinguish the contributions of different features when
perturbing them produces the same change in prediction. To enhance the
ability of FAMs to distinguish different features' contributions in this
challenging setting, we propose to use the Probability of Necessity and
Sufficiency (PNS), i.e., the probability that perturbing a feature is a
necessary and sufficient cause for the prediction to change, as a measure of
feature importance. Our approach, Feature Attribution with Necessity and
Sufficiency (FANS), computes the PNS via a perturbation test involving two
stages (factual and interventional). In practice, to generate counterfactual
samples, we use a resampling-based approach on the observed samples to
approximate the required conditional distribution. Finally, we combine FANS
and gradient-based optimization to extract the feature subset with the largest
PNS. We demonstrate that FANS outperforms existing feature attribution methods
on six benchmarks.
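
For orientation, the standard counterfactual quantities behind PNS can be written out as follows. This is a sketch in Pearl's textbook notation for binary events, not necessarily the exact formulation used in the paper: here x is assumed to denote "the feature is perturbed", x' "it is not", y "the prediction changes", and y' "it does not". The last line suggests why a two-stage test is natural: each term conditions on a factual event and then evaluates an intervention.

```latex
% Assumed notation: Y_x is the outcome Y under the intervention do(X = x).
\begin{align}
  \mathrm{PN}  &= P\left(Y_{x'} = y' \mid X = x,\; Y = y\right)
      && \text{(probability of necessity)} \\
  \mathrm{PS}  &= P\left(Y_{x} = y \mid X = x',\; Y = y'\right)
      && \text{(probability of sufficiency)} \\
  \mathrm{PNS} &= P\left(Y_{x} = y,\; Y_{x'} = y'\right)
      && \text{(necessity and sufficiency)} \\
  \mathrm{PNS} &= P(x, y)\,\mathrm{PN} + P(x', y')\,\mathrm{PS}
      && \text{(decomposition over the factual events)}
\end{align}
```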