Intermediate-Layer Transferable Adversarial Attack With DNN Attention

IEEE Access (2022)

Abstract
The widespread deployment of deep learning models in practice necessitates an assessment of their vulnerability, particularly in security-sensitive areas. As a result, transfer-based adversarial attacks have elicited increasing interest in assessing the security of deep learning models. However, adversarial samples usually exhibit poor transferability across different models because they overfit the particular architecture and feature representation of the source model. To address this problem, the Intermediate Layer Attack with Attention guidance (IAA) is proposed to alleviate overfitting and enhance black-box transferability. IAA operates on an intermediate layer l of the source model. Guided by the model's attention (i.e., gradients) with respect to the features of layer l, the attack algorithm seeks and undermines the key features that are likely to be adopted by diverse architectures. Significantly, IAA focuses on improving existing white-box attacks without introducing significant degradation of visual perceptual quality. Namely, IAA maintains the white-box attack performance of the original algorithm while significantly enhancing its black-box transferability. Extensive experiments on ImageNet classifiers confirmed the effectiveness of our method. The proposed IAA outperformed all state-of-the-art benchmarks in various white-box and black-box settings, i.e., improving the success rate of BIM by 29.65% against normally trained models and 27.16% against defense models.
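As a rough illustration of the mechanism the abstract describes (not the authors' code), the sketch below uses a toy two-layer network in NumPy: the "attention" is the gradient of the true-class logit with respect to the intermediate layer-l features, and BIM-style sign steps suppress the attention-weighted feature response while staying in an epsilon-ball around the clean input. All weights, shapes, and step sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer "source model" (weights are illustrative, not the paper's).
W1 = rng.normal(size=(8, 4))   # input -> intermediate layer l
W2 = rng.normal(size=(3, 8))   # intermediate layer l -> logits
true_class = 0

def forward(x):
    z = W1 @ x
    h = np.maximum(z, 0.0)     # features at layer l (ReLU)
    return z, h, W2 @ h

def attention(x):
    # "Attention": gradient of the true-class logit w.r.t. the
    # layer-l features, evaluated through the ReLU mask.
    z, _, _ = forward(x)
    return W2[true_class] * (z > 0)

def iaa_step(x, x0, eps=0.5, alpha=0.1):
    # One BIM-style step that undermines the key features:
    # descend on sum(a * h) over x, with a held fixed (stop-gradient).
    z, h, _ = forward(x)
    a = attention(x)
    grad_x = W1.T @ (a * (z > 0))           # d/dx of sum(a * h)
    x_adv = x - alpha * np.sign(grad_x)     # suppress attended features
    return np.clip(x_adv, x0 - eps, x0 + eps)  # stay in the eps-ball

x0 = rng.normal(size=4)
x_adv = x0.copy()
for _ in range(10):
    x_adv = iaa_step(x_adv, x0)

# Attention-weighted feature response before and after the attack.
score = lambda x: float(attention(x) @ forward(x)[1])
print(score(x0), score(x_adv))
```

This only mimics the attention-guided objective on a toy model; the paper applies the idea to ImageNet classifiers on top of existing white-box attacks such as BIM.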
Keywords
Deep learning, adversarial samples, black-box attack, transferability, intermediate layer, attention-guided