A Novel Backdoor Attack Adapted to Transfer Learning.

SmartWorld/UIC/ScalCom/DigitalTwin/PriComp/Meta(2022)

Abstract
Backdoor attacks have recently emerged as a major security threat to deep neural networks. The attacker embeds hidden malicious behaviors into a deep learning model that are activated only when the input contains specific triggers. Although backdoor attacks achieve high success rates and are very stealthy, existing backdoored models often lose their attack capability after transfer learning. A critical open question is how to ensure that the malicious behavior of a backdoored model survives transfer learning undisturbed. In this paper, we conduct a detailed empirical study of multiple existing backdoor defenses and find that they generally rely on the model processing backdoor triggers and clean samples through different recognition mechanisms. To resist existing defenses, we propose a method that inversely generates backdoor triggers from the gradient information of the model. In addition, to keep the backdoor from being disturbed by transfer learning, we use a modified Triplet Loss to inject the backdoor only into the convolutional layers, and erase the backdoor information from the fully connected layer while preserving the effectiveness of the attack. Finally, we evaluate the attack and four potential defenses on several benchmark datasets, including MNIST, CIFAR-10, and GTSRB. The proposed attack achieves 100% success rates while circumventing the interference of existing backdoor defenses.
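The abstract describes injecting the backdoor with a modified Triplet Loss applied to convolutional-layer features. As a rough illustration only (the paper's exact loss and feature extraction are not given here), a minimal NumPy sketch of a standard triplet loss is shown below, where the anchor would be the convolutional features of a triggered input, the positive a target-class sample, and the negative a clean source-class sample; all names and the margin value are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on feature vectors (illustrative sketch).

    Pulls the anchor (e.g., conv features of a triggered input) toward
    the positive (target-class features) and pushes it away from the
    negative (clean source-class features).
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to negative
    return max(0.0, d_pos - d_neg + margin)

# Anchor already close to the positive and far from the negative: zero loss.
a = np.zeros(4)
p = np.zeros(4)
n = np.ones(4)
print(triplet_loss(a, p, n))  # 0.0 (0 - 4 + 1 = -3, clamped to 0)
```

Minimizing this loss on the convolutional layers alone, while the fully connected layer is trained only on clean data, is the intuition behind surviving transfer learning: replacing or fine-tuning the classifier head does not remove the feature-level association between trigger and target class.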
Keywords
neural networks,backdoor attacks,transfer learning