Defense Against Adversarial Attacks Using Topology Aligning Adversarial Training

IEEE Transactions on Information Forensics and Security (2024)

Abstract
Recent works have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks, wherein an attacker perturbs an input example with human-imperceptible noise that easily fools the DNN into making incorrect predictions. This severely limits the application of deep learning in security-critical scenarios such as face authentication. Adversarial training (AT) is one of the most practical approaches to strengthening the robustness of DNNs. However, existing AT-based methods treat each training sample independently, thereby ignoring the underlying topological structure of the training data. To address this, in this paper we take full advantage of topology information and introduce a Topology Aligning Adversarial Training (TAAT) algorithm. TAAT encourages the trained model to maintain a consistent topological structure between the feature spaces of natural and adversarial examples. To ensure the stability and efficiency of topology alignment, we further introduce a novel Knowledge-Guided (KG) training scheme, which explicitly aligns local logit outputs with global topological structures by leveraging a robust auxiliary model to enhance the target model's performance. To verify the effectiveness of the proposed method, we conduct extensive experiments on popular benchmark datasets (e.g., CIFAR and ImageNet) and evaluate robustness against state-of-the-art adversarial attacks (e.g., PGD and AutoAttack). The experimental results demonstrate that the proposed method achieves superior robustness over previous state-of-the-art methods. Our code and pre-trained models are available at https://github.com/SkyKuang/TAAT.
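The abstract does not spell out the alignment objective, but the idea lends itself to a short sketch. The PyTorch snippet below is a minimal, hypothetical illustration: it approximates the "topological structure" of a batch by the row-normalized pairwise cosine-similarity matrix of its features, crafts adversarial examples with standard L-infinity PGD, and penalizes divergence between the natural and adversarial similarity structures alongside the usual robust classification loss. Names such as forward_features, alpha, and tau are assumptions for illustration, not the authors' API.

    # Hypothetical sketch of one topology-aligning adversarial training step.
    # The paper's exact objective is not given in the abstract; here the
    # "topological structure" of a batch is approximated by the row-normalized
    # pairwise cosine-similarity matrix of its features.
    import torch
    import torch.nn.functional as F

    def neighborhood_logits(features, tau=0.1):
        # Scaled cosine similarities between all pairs of samples in the batch.
        z = F.normalize(features, dim=1)
        return z @ z.t() / tau

    def topology_alignment_loss(nat_feats, adv_feats, tau=0.1):
        # KL divergence between the neighborhood distributions of adversarial
        # and natural features: small when both batches share the same topology.
        # (A real implementation might also mask the diagonal self-similarities.)
        log_p_adv = F.log_softmax(neighborhood_logits(adv_feats, tau), dim=1)
        p_nat = F.softmax(neighborhood_logits(nat_feats, tau), dim=1).detach()
        return F.kl_div(log_p_adv, p_nat, reduction="batchmean")

    def pgd_attack(model, x, y, eps=8 / 255, step=2 / 255, iters=10):
        # Standard L-infinity PGD used to craft adversarial examples.
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(iters):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + step * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv.detach()

    def train_step(model, x, y, optimizer, alpha=1.0):
        # Robust classification loss on adversarial examples, plus a topology
        # alignment term tying adversarial features back to the natural ones.
        x_adv = pgd_attack(model, x, y)
        nat_feats, _ = model.forward_features(x)                # assumed helper that
        adv_feats, adv_logits = model.forward_features(x_adv)   # returns (features, logits)
        loss = F.cross_entropy(adv_logits, y) \
            + alpha * topology_alignment_loss(nat_feats, adv_feats)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

Per the abstract, the Knowledge-Guided scheme would additionally distill logits from a robust auxiliary (teacher) model; that term is omitted from the sketch for brevity.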
Keywords
Adversarial training, model robustness, topology aligning, knowledge distillation