BELLA: Black box model Explanations by Local Linear Approximations

CoRR(2023)

引用 0|浏览21
暂无评分
摘要
In recent years, understanding the decision-making process of black-box models has become not only a legal requirement but also an additional way to assess their performance. However, the state of the art post-hoc interpretation approaches rely on synthetic data generation. This introduces uncertainty and can hurt the reliability of the interpretations. Furthermore, they tend to produce explanations that apply to only very few data points. This makes the explanations brittle and limited in scope. Finally, they provide scores that have no direct verifiable meaning. In this paper, we present BELLA, a deterministic model-agnostic post-hoc approach for explaining the individual predictions of regression black-box models. BELLA provides explanations in the form of a linear model trained in the feature space. Thus, its coefficients can be used directly to compute the predicted value from the feature values. Furthermore, BELLA maximizes the size of the neighborhood to which the linear model applies, so that the explanations are accurate, simple, general, and robust. BELLA can produce both factual and counterfactual explanations. Our user study confirms the importance of the desiderata we optimize, and our experiments show that BELLA outperforms the state-of-the-art approaches on these desiderata.
更多
查看译文
关键词
black box model explanations,approximations,linear,local
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要