Local Gradients Smoothing: Defense against localized adversarial attacks

2019 IEEE Winter Conference on Applications of Computer Vision (WACV)(2018)

引用 144|浏览321
暂无评分
摘要
Deep neural networks (DNNs) have shown vulnerability to adversarial attacks, i.e., carefully perturbed inputs designed to mislead the network at inference time. Recently introduced localized attacks, Localized and Visible Adversarial Noise (LaVAN) and Adversarial patch, pose a new challenge to deep learning security by adding adversarial noise only within a specific region without affecting the salient objects in an image. Driven by the observation that such attacks introduce concentrated high-frequency changes at a particular image location, we have developed an effective method to estimate noise location in gradient domain and transform those high activation regions caused by adversarial noise in image domain while having minimal effect on the salient object that is important for correct classification. Our proposed Local Gradients Smoothing (LGS) scheme achieves this by regularizing gradients in the estimated noisy region before feeding the image to DNN for inference. We have shown the effectiveness of our method in comparison to other defense methods including Digital Watermarking, JPEG compression, Total Variance Minimization (TVM) and Feature squeezing on ImageNet dataset. In addition, we systematically study the robustness of the proposed defense mechanism against Back Pass Differentiable Approximation (BPDA), a state of the art attack recently developed to break defenses that transform an input sample to minimize the adversarial effect. Compared to other defense mechanisms, LGS is by far the most resistant to BPDA in localized adversarial attack setting.
更多
查看译文
关键词
deep neural networks,inference time,deep learning security,salient object,high-frequency changes,noise location,gradient domain,high activation regions,image domain,minimal effect,estimated noisy region,defense mechanism,adversarial effect,localized adversarial attack setting,local gradients smoothing,localized adversarial attacks,image location,total variance minimization,digital watermarking,JPEG compression,feature squeezing,ImageNet dataset,back pass differentiable approximation,BPDA
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要