On the Faithfulness of Vision Transformer Explanations
CVPR 2024
Abstract
To interpret Vision Transformers, post-hoc explanations assign salience
scores to input pixels, providing human-understandable heatmaps. However,
whether these interpretations reflect the true rationale behind the model's
output is still underexplored. To address this gap, we study the faithfulness
criterion of explanations: the assigned salience scores should represent the
influence of the corresponding input pixels on the model's predictions. To
evaluate faithfulness, we introduce the Salience-guided Faithfulness
Coefficient (SaCo), a novel evaluation metric that leverages essential
information in the salience distribution. Specifically, we conduct pair-wise
comparisons among distinct pixel groups and then aggregate the differences in
their salience scores, resulting in a coefficient that indicates the
explanation's degree of faithfulness. Our explorations reveal that current
metrics struggle to differentiate between advanced explanation methods and
Random Attribution, thereby failing to capture the faithfulness property. In
contrast, our proposed SaCo offers a reliable faithfulness measurement,
establishing a robust metric for interpretations. Furthermore, SaCo
demonstrates that the use of gradients and multi-layer aggregation can
markedly enhance the faithfulness of attention-based explanations, shedding
light on potential paths for advancing Vision Transformer explainability.
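
The abstract only outlines the mechanism, so the following is a minimal sketch of the pair-wise comparison idea, not the paper's exact formulation. It assumes a hypothetical `model_confidence` helper (the model's confidence in its originally predicted class), zero-masking as the perturbation, and equally sized pixel groups ranked by salience; the paper may define the grouping, perturbation, and agreement rule differently.

```python
import numpy as np

def saco_sketch(salience, image, model_confidence, num_groups=10, mask_value=0.0):
    """Sketch of a SaCo-style faithfulness coefficient (assumptions noted above).

    salience: (H, W) array of per-pixel salience scores from an explanation.
    image: (H, W, C) input image.
    model_confidence: hypothetical callable mapping an image to the scalar
        confidence of the originally predicted class (not a real API).
    Returns a coefficient in [-1, 1]; higher means more faithful.
    """
    flat = salience.ravel()
    order = np.argsort(flat)[::-1]              # pixel indices, most salient first
    groups = np.array_split(order, num_groups)  # equally sized pixel groups

    base = model_confidence(image)
    group_salience, group_impact = [], []
    for g in groups:
        # Total salience the explanation attributes to this group.
        group_salience.append(flat[g].sum())
        # Measured influence: confidence drop when the group's pixels are masked.
        perturbed = image.copy().reshape(-1, image.shape[-1])
        perturbed[g] = mask_value
        group_impact.append(base - model_confidence(perturbed.reshape(image.shape)))

    # Pair-wise comparisons: a salience difference counts positively when the
    # model's measured influence agrees with the salience ordering, and
    # negatively otherwise; normalizing yields a coefficient in [-1, 1].
    score, total = 0.0, 0.0
    for i in range(num_groups):
        for j in range(i + 1, num_groups):
            diff = abs(group_salience[i] - group_salience[j])
            agree = group_impact[i] >= group_impact[j]
            score += diff if agree else -diff
            total += diff
    return score / total if total > 0 else 0.0
```

Under these assumptions, a faithful explanation concentrates salience on the groups whose masking causes the largest confidence drops, pushing the coefficient toward 1, while a Random Attribution baseline should land near 0.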