Improving Adversarial Transferability by Stable Diffusion.
CoRR (2023)
Abstract
Deep neural networks (DNNs) are susceptible to adversarial examples: benign
samples altered by imperceptible perturbations that mislead DNN
predictions. While some attack methods excel in the white-box setting, they
often struggle in the black-box scenario, particularly against models fortified
with defense mechanisms. Various techniques have emerged to enhance the
transferability of adversarial attacks for the black-box scenario. Among these,
input transformation-based attacks have demonstrated their effectiveness. In
this paper, we explore the potential of leveraging data generated by Stable
Diffusion to boost adversarial transferability. This approach draws inspiration
from recent research that harnessed synthetic data generated by Stable
Diffusion to enhance model generalization. In particular, previous work has
shown that combining real and synthetic data correlates with improved model
generalization. Building upon this insight, we
introduce a novel attack method called Stable Diffusion Attack Method (SDAM),
which incorporates samples generated by Stable Diffusion to augment input
images. Furthermore, we propose a fast variant of SDAM to reduce computational
overhead while preserving high adversarial transferability. Our extensive
experimental results demonstrate that our method outperforms state-of-the-art
baselines by a substantial margin. Moreover, our approach is compatible with
existing transfer-based attacks to further enhance adversarial transferability.
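
The abstract does not spell out how SDAM combines synthetic samples with the input, so the following is only a minimal sketch of one plausible reading: mix the current adversarial input with each synthetic sample, average the input gradients of the surrogate model over the mixed copies, and take a momentum sign step. Everything specific here is an assumption made for illustration: the toy linear surrogate with a logistic loss, the mixing weight `eta`, the momentum factor `mu`, the function names, and the hand-written vectors standing in for images generated by Stable Diffusion. It is not the authors' algorithm.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_loss(w, x, y):
    """Logistic loss of a toy linear surrogate model (an assumption, not a real DNN)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def loss_grad(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * wi for wi in w]


def mixed_gradient(w, x_adv, y, synthetic, eta):
    """Average the input gradient over copies of x_adv mixed with each synthetic sample."""
    total = [0.0] * len(x_adv)
    for s in synthetic:
        x_mix = [(1.0 - eta) * a + eta * b for a, b in zip(x_adv, s)]
        g = loss_grad(w, x_mix, y)
        # Chain rule: d(x_mix)/d(x_adv) = (1 - eta).
        total = [t + (1.0 - eta) * gi for t, gi in zip(total, g)]
    return [t / len(synthetic) for t in total]


def sign(v):
    return (v > 0) - (v < 0)


def attack(w, x, y, synthetic, eps=0.1, steps=10, mu=1.0, eta=0.2):
    """Momentum sign-gradient ascent on the loss, using gradients from mixed inputs."""
    alpha = eps / steps
    x_adv = list(x)
    momentum = [0.0] * len(x)
    for _ in range(steps):
        g = mixed_gradient(w, x_adv, y, synthetic, eta)
        l1 = sum(abs(gi) for gi in g) or 1.0
        momentum = [mu * m + gi / l1 for m, gi in zip(momentum, g)]
        x_adv = [a + alpha * sign(m) for a, m in zip(x_adv, momentum)]
        # Project back into the eps-ball around x, then into the valid range [0, 1].
        x_adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(x_adv, x)]
        x_adv = [min(max(a, 0.0), 1.0) for a in x_adv]
    return x_adv


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]                                # toy surrogate weights
    x = [0.2, 0.4, 0.1]                                 # benign input, label 1
    synthetic = [[0.3, 0.3, 0.3], [0.1, 0.6, 0.2]]      # placeholders for generated samples
    x_adv = attack(w, x, 1, synthetic)
    print(logistic_loss(w, x, 1), logistic_loss(w, x_adv, 1))
```

In a realistic setting the surrogate would be a trained DNN with gradients obtained by automatic differentiation, and the synthetic samples would be images produced by a diffusion model; both are outside the scope of this sketch.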