A Unified Optimization Framework for Feature-based Transferable Attacks

IEEE Transactions on Information Forensics and Security (2024)

Abstract
Despite the rapid progress and significant success of deep learning across a wide spectrum of fields, adversarial examples expose deep learning models to many security threats. Adversarial examples have recently been found to be transferable: an adversarial example crafted against one model can also attack another model. This property has attracted many researchers to work on improving the transferability of adversarial examples. Compared with traditional attacks that disrupt output logits (dubbed logit-based attacks), recent work shows that disrupting feature maps instead (dubbed feature-based attacks) yields more transferable adversarial examples. However, previous feature-based attacks mostly rely on intuitively designed optimization goals that are specialized for particular scenarios, and they lack both theoretical motivation and a unified framework. To overcome these limitations, we propose a Unified Feature-based Attack Framework (UFAF), which combines a dispersion loss and a distance loss and unifies eight existing feature-based attacks. We also bridge the formulation gap between feature-based attacks and traditional logit-based attacks. Within UFAF, we propose an Entropy-Wasserstein (EW) attack by specifying the dispersion loss as entropy and the distance loss as the Wasserstein distance. We further provide a theoretical analysis to guarantee the effectiveness of the proposed attack. Extensive experiments show that our EW attack outperforms state-of-the-art attacks by 4.95% in attack success rate in untargeted settings, and by 1.95% in targeted transfer rate and 1.17% in target success rate in targeted settings. Moreover, our framework helps other feature-based attacks improve their performance by 7.7% in untargeted settings.
Keywords
Adversarial attacks, untargeted attacks, targeted attack, transferable attacks, unified framework
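
The abstract describes an objective that pairs a dispersion loss (entropy) with a distance loss (Wasserstein distance) computed on feature maps. The following is a minimal sketch of what such a pairing could look like, not the paper's formulation. It rests on several assumptions that the abstract does not state: feature maps are flattened and softmax-normalized before taking the entropy, the Wasserstein term is the 1-D Wasserstein-1 distance between equal-size empirical samples, and the two terms are summed with a hypothetical weight `lam`. All function names are illustrative, and whether each term is maximized or minimized during the attack is left open.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def entropy_dispersion(features):
    """Shannon entropy of the softmax-normalized, flattened feature map.

    Assumption: softmax normalization is one possible way to turn
    activations into a distribution; the abstract does not specify it.
    """
    p = softmax(np.asarray(features, dtype=float).ravel())
    return float(-(p * np.log(p + 1e-12)).sum())


def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size empirical samples.

    For equal-size samples this equals the mean absolute difference
    between the sorted values.
    """
    a = np.sort(np.asarray(a, dtype=float).ravel())
    b = np.sort(np.asarray(b, dtype=float).ravel())
    if a.size != b.size:
        raise ValueError("samples must have the same number of elements")
    return float(np.abs(a - b).mean())


def combined_objective(clean_feat, adv_feat, lam=1.0):
    """Hypothetical weighted sum of the dispersion and distance terms."""
    dispersion = entropy_dispersion(adv_feat)
    distance = wasserstein_1d(clean_feat, adv_feat)
    return dispersion + lam * distance
```

In an actual attack these quantities would be computed on intermediate activations of a surrogate network and differentiated with respect to the input perturbation; the NumPy version above only illustrates how the two terms are evaluated.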