A Benchmark For Interpretability Methods In Deep Neural Networks

Sara Hooker, Dumitru Erhan, Pieter-Jan Kindermans, Been Kim

Advances in Neural Information Processing Systems 32 (NIPS 2019)

Citations: 638 | Views: 340

Abstract

We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are no better than a random designation of feature importance. Only certain ensemble-based approaches (VarGrad and SmoothGrad-Squared) outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method but carry a far higher computational burden.
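
The abstract names VarGrad and SmoothGrad-Squared without defining them. As a rough illustration only, the sketch below follows the way these ensembles are commonly described: each evaluates a base gradient estimate on many noise-perturbed copies of the input, then SmoothGrad averages the gradients, SmoothGrad-Squared averages their squares, and VarGrad takes their variance. The function `toy_gradient`, the weights inside it, and the values of `n_samples` and `sigma` are hypothetical stand-ins chosen for this example; they are not taken from the paper, and a real model's gradient would replace the analytic one used here.

```python
import numpy as np

def toy_gradient(x):
    # Hypothetical stand-in for a network's input gradient.
    # For f(x) = sum(w * x**2), the gradient is 2 * w * x.
    w = np.array([1.0, 0.0, 3.0])
    return 2.0 * w * x

def noisy_gradients(grad_fn, x, n_samples=50, sigma=0.1, seed=0):
    # Evaluate the base estimator on n_samples Gaussian-perturbed inputs.
    # n_samples and sigma are illustrative defaults, not tuned values.
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    return np.stack([grad_fn(x + eps) for eps in noise])

def smoothgrad(grads):
    # Mean of the gradients over the noisy samples.
    return grads.mean(axis=0)

def smoothgrad_squared(grads):
    # Mean of the squared gradients over the noisy samples.
    return (grads ** 2).mean(axis=0)

def vargrad(grads):
    # Population variance of the gradients over the noisy samples.
    return grads.var(axis=0)

x = np.array([1.0, 2.0, -1.0])
grads = noisy_gradients(toy_gradient, x)
print("SmoothGrad:        ", smoothgrad(grads))
print("SmoothGrad-Squared:", smoothgrad_squared(grads))
print("VarGrad:           ", vargrad(grads))
```

All three ensembles cost `n_samples` gradient evaluations per input, which is the higher computational burden the abstract refers to. With these definitions, VarGrad equals SmoothGrad-Squared minus the square of SmoothGrad, so the three differ only in how the same set of noisy gradients is aggregated.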

Keywords

random assignment
