Multi-scale Features Destructive Universal Adversarial Perturbations

Huangxinyue Wu, Haoran Li, Jinhong Zhang, Wei Zhou, Lei Guo, Yunyun Dong

ICICS (2023)

Abstract
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, in which imperceptible perturbations added to input examples cause incorrect predictions. Adversarial attacks fall into two categories: image-dependent and image-agnostic. Image-dependent attacks craft a unique adversarial perturbation for each clean example, whereas image-agnostic attacks create a single universal adversarial perturbation (UAP) that fools the target model on all clean examples. However, existing UAP methods rely only on the output of the target DNN within a limited perturbation magnitude, so the UAP fails to exploit the DNN's entire feature-extraction process. In this paper, we consider the difference between the mid-level features of a clean example and those of its corresponding adversarial example at different intermediate layers of the target DNN. Specifically, we maximize the impact of adversarial examples during forward propagation by pulling apart the feature representations of clean and adversarial examples. Moreover, to support both targeted and non-targeted attacks, we design a loss function that emphasizes the UAP's feature representation and guides the direction of the perturbation in the feature layers. Furthermore, to reduce training time and the number of training parameters, we adopt a direct optimization approach to craft UAPs, and we experimentally demonstrate a higher fooling rate with fewer examples. Extensive experimental results show that our approach outperforms state-of-the-art methods in both non-targeted and targeted universal attacks.
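The direct-optimization scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the backbone (torchvision's VGG-16), the tapped layers, the L_inf budget, and all hyperparameters are assumptions; only the core idea, optimizing a single perturbation so that the intermediate-layer features of adversarial inputs are pulled away from those of clean inputs, comes from the abstract.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Minimal sketch of direct UAP optimization via mid-level feature separation.
# Backbone, tapped layers, budget, and hyperparameters are all assumptions.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).to(device).eval()
for p in model.parameters():
    p.requires_grad_(False)

# Forward hooks capture activations at several intermediate (multi-scale) layers.
feats = {}
def make_hook(name):
    def hook(module, inputs, output):
        feats[name] = output
    return hook

for idx in (4, 9, 16, 23):  # assumed tap points inside vgg16.features
    model.features[idx].register_forward_hook(make_hook(f"layer{idx}"))

eps = 10 / 255  # assumed L_inf perturbation budget
uap = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([uap], lr=1e-2)

def separation_loss(x):
    """Non-targeted loss: pull adversarial mid-level features away from
    the clean ones at every tapped layer (the negative sign means that
    minimizing this loss maximizes the feature distance)."""
    model(x)
    clean = {k: v.detach() for k, v in feats.items()}
    model((x + uap).clamp(0.0, 1.0))
    return -sum(F.mse_loss(feats[k], clean[k]) for k in clean)

# Stand-in batches of random images; replace with real (normalized) data.
loader = [torch.rand(8, 3, 224, 224) for _ in range(10)]

for x in loader:
    loss = separation_loss(x.to(device))
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        uap.clamp_(-eps, eps)  # project back into the allowed budget
```

The sketch covers only the non-targeted case; for the targeted attack the abstract's loss would instead steer the adversarial features toward a chosen target representation rather than simply away from the clean one.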
Keywords
features, multi-scale