Succinct Interaction-Aware Explanations
arXiv (2024)
Abstract
SHAP is a popular approach to explaining black-box models by revealing the
importance of individual features. As it ignores feature interactions, SHAP
explanations can be confusing or even misleading. NSHAP, on the other hand,
reports the additive importance of all subsets of features. While this does
include all interacting sets of features, it also leads to an exponentially
sized and hence difficult-to-interpret explanation. In this paper, we propose
to combine the best of both worlds by partitioning the features into parts
that significantly interact, and using these parts to compose a succinct,
interpretable, additive explanation. We derive a criterion that measures how
representative such a partition is of a model's behavior, traded off against
the complexity of the resulting explanation. To efficiently find the best
partition out of super-exponentially many, we show how to prune sub-optimal
solutions using a statistical test, which not only improves runtime but also
helps to detect spurious interactions. Experiments on synthetic and real-world
data show that our explanations are both more accurate than those of SHAP and
more easily interpretable than those of NSHAP.
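To make the partition-scoring idea concrete, below is a minimal, hypothetical Python sketch, not the authors' implementation. It brute-forces every partition of a toy feature set, scores each by how well a partition-additive surrogate matches the model's value function at the grand coalition, and adds a complexity penalty that grows with part size. The value function `v`, the penalty `alpha * 2**|part|`, and the single-coalition gap measure are illustrative stand-ins for the paper's actual criterion, which would assess faithfulness more broadly and prune the search statistically.

```python
def partitions(items):
    """Yield every set partition of `items` (Bell-number many)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # place `first` into each existing block ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a block of its own
        yield [[first]] + part

def score(partition, v, alpha=0.1):
    """Gap between the model's value and a partition-additive surrogate,
    plus a complexity penalty: explanation size grows exponentially
    in part size (this penalty is an illustrative assumption)."""
    full = frozenset(f for block in partition for f in block)
    additive = sum(v(frozenset(block)) - v(frozenset()) for block in partition)
    gap = abs(v(full) - (v(frozenset()) + additive))
    complexity = sum(2 ** len(block) for block in partition)
    return gap + alpha * complexity

# Toy value function with an interaction between features 0 and 1.
def v(S):
    return 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S) + 0.5 * (2 in S)

best = min(partitions([0, 1, 2]), key=lambda p: score(p, v))
print(best)  # the interacting features end up in one part: [[0, 1], [2]]
```

Exhaustive enumeration like this only works for tiny feature sets: the number of partitions grows as the Bell numbers, and Bell(20) already exceeds 5 x 10^13, which is why the paper's statistical pruning of sub-optimal partitions matters in practice.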