ML-LOO: Detecting Adversarial Examples with Feature Attribution
National Conference on Artificial Intelligence, 2020.
Deep neural networks obtain state-of-the-art performance on a series of tasks. However, they are easily fooled by adding a small adversarial perturbation to the input. On image data, the perturbation is often imperceptible to humans. We observe a significant difference between the feature attributions of adversarially crafted examples and those of origi...
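To make the notion of feature attribution concrete, here is a minimal sketch of leave-one-out (LOO) attribution, the idea the title refers to: each feature's score is the drop in the model output when that feature is replaced by a baseline value. The names `loo_attribution` and `toy_score`, the weights, the zero baseline, and the hand-made "perturbed" input are all illustrative assumptions, not the authors' implementation or data.

```python
from typing import Callable, List, Sequence


def loo_attribution(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: float = 0.0,
) -> List[float]:
    """Return, for each feature, the change in score when it is masked."""
    full_score = predict(x)
    attributions = []
    for i in range(len(x)):
        masked = list(x)
        masked[i] = baseline  # "leave out" feature i (assumed zero baseline)
        attributions.append(full_score - predict(masked))
    return attributions


# Hypothetical toy scorer standing in for a network's class score.
WEIGHTS = [0.5, -1.0, 2.0]


def toy_score(x: Sequence[float]) -> float:
    return sum(w * v for w, v in zip(WEIGHTS, x))


clean = [1.0, 2.0, 3.0]
perturbed = [1.0, 2.0, 3.5]  # hand-made shift, not a real adversarial attack

clean_attr = loo_attribution(toy_score, clean)
perturbed_attr = loo_attribution(toy_score, perturbed)
print(clean_attr, perturbed_attr)
```

Comparing `clean_attr` with `perturbed_attr` shows how a small input change can shift the attribution vector; a detector in this spirit would compare some statistic of the two vectors, and the choice of statistic is left open here.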