Self-adaptive logit balancing for deep neural network robustness: Defence and detection of adversarial attacks.

Neurocomputing (2023)

Abstract
With the widespread application of Deep Neural Networks (DNNs), their safety has become a significant issue. The vulnerability of neural networks to adversarial examples deepens concerns about the safety of DNN applications. This paper proposes a novel defence method that improves the adversarial robustness of DNN classifiers without adversarial training. The method introduces two new loss functions. First, a zero-cross-entropy loss penalises overconfidence and finds the appropriate confidence for different instances. Second, a logit balancing loss protects DNNs against non-targeted attacks by regularising the logit distribution of the incorrect classes. The method achieves adversarial robustness competitive with advanced adversarial training methods. In addition, a novel robustness diagram is proposed to analyse, interpret and visualise the robustness of DNN classifiers against adversarial attacks. Furthermore, a Log-Softmax-pattern-based adversarial attack detection method is proposed. This detection method distinguishes clean inputs from multiple adversarial attacks with a single multi-class MLP. In particular, it is state-of-the-art at identifying white-box gradient-based attacks, classifying four such attacks with at least 95.5% accuracy at a maximum false positive rate of 0.1%.
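The abstract only names the two loss terms, so the sketch below is a minimal illustrative interpretation in PyTorch rather than the published formulation. The confidence cap, the hinge-style overconfidence penalty, the variance-based regulariser on the non-ground-truth logits, and the weighting factors are all assumptions introduced for illustration.

```python
import torch
import torch.nn.functional as F

def overconfidence_penalty(logits, targets, confidence_cap=0.9):
    # Illustrative stand-in for the "zero-cross-entropy" idea: penalise
    # probability mass assigned to the true class beyond an assumed cap.
    probs = F.softmax(logits, dim=1)
    true_prob = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return F.relu(true_prob - confidence_cap).mean()

def logit_balancing_penalty(logits, targets):
    # Illustrative stand-in for the logit balancing loss: push the logits
    # of the incorrect classes towards a flat distribution by penalising
    # their per-sample variance (the variance form is an assumption).
    mask = torch.ones_like(logits, dtype=torch.bool)
    mask[torch.arange(logits.size(0)), targets] = False
    wrong_logits = logits[mask].view(logits.size(0), -1)
    return wrong_logits.var(dim=1).mean()

def total_loss(logits, targets, lam1=1.0, lam2=0.1):
    # Standard cross-entropy plus the two illustrative regularisers;
    # lam1 and lam2 are placeholder weights, not values from the paper.
    return (F.cross_entropy(logits, targets)
            + lam1 * overconfidence_penalty(logits, targets)
            + lam2 * logit_balancing_penalty(logits, targets))
```

Likewise, the detection method is only described as a multi-class MLP over Log-Softmax patterns. A minimal sketch of such a detector, with an assumed five-way output (clean plus four white-box gradient-based attack types) and illustrative layer sizes, could look like this:

```python
import torch.nn as nn

class LogSoftmaxPatternDetector(nn.Module):
    """Assumed architecture: a small MLP that classifies the victim
    model's log-softmax output vector as clean or as one of several
    attack types. Layer sizes are illustrative, not from the paper."""
    def __init__(self, num_classes=10, num_attack_types=4, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1 + num_attack_types),  # clean + attacks
        )

    def forward(self, log_softmax_vector):
        return self.mlp(log_softmax_vector)

# Usage: feed F.log_softmax(victim_logits, dim=1) to the detector and train
# it with cross-entropy against labels {clean, attack 1, ..., attack 4}.
```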
Keywords
Machine learning security, Adversarial examples, Adversarial robustness, Adversarial attacks detection, Deep neural networks