Generating Less Certain Adversarial Examples Improves Robust Generalization
arXiv (Cornell University), 2023
Abstract
This paper revisits the robust overfitting phenomenon of adversarial
training. Observing that models with better robust generalization performance
are less certain in predicting adversarially generated training inputs, we
argue that overconfidence in predicting adversarial examples is a potential
cause. Therefore, we hypothesize that generating less certain adversarial
examples improves robust generalization, and propose a formal definition of
adversarial certainty that captures the variance of the model's predicted
logits on adversarial examples. Our theoretical analysis of synthetic
distributions characterizes the connection between adversarial certainty and
robust generalization. Accordingly, built upon the notion of adversarial
certainty, we develop a general method to search for models that can generate
training-time adversarial inputs with reduced certainty, while maintaining the
model's capability in distinguishing adversarial examples. Extensive
experiments on image benchmarks demonstrate that our method effectively learns
models with consistently improved robustness and mitigates robust overfitting,
confirming the importance of generating less certain adversarial examples for
robust generalization.
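To make the central quantity concrete, below is a minimal PyTorch sketch of the idea as the abstract describes it: adversarial certainty measured as the variance of the model's predicted logits on adversarial examples, here generated with a standard L-infinity PGD attack. The function names (`pgd_attack`, `adversarial_certainty`) and the hyperparameters (`eps`, `alpha`, `steps`) are illustrative assumptions, not the authors' implementation or exact definition.

```python
# Sketch: "adversarial certainty" as the variance of predicted logits on
# PGD adversarial examples. Names and hyperparameters are illustrative;
# the paper's formal definition and method may differ in detail.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-infinity PGD: random start, then signed-gradient ascent
    on the cross-entropy loss, projected back into the eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project onto the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv


def adversarial_certainty(model, x, y, **pgd_kwargs):
    """Per-example variance of the logits on adversarial inputs, averaged
    over the batch. Lower values = 'less certain' adversarial examples."""
    x_adv = pgd_attack(model, x, y, **pgd_kwargs)
    with torch.no_grad():
        logits = model(x_adv)          # shape: (batch, num_classes)
    return logits.var(dim=1).mean()    # variance across classes, batch mean
```

A training procedure in the spirit of the paper would then prefer model updates that reduce this statistic for training-time adversarial examples while keeping the adversarial training loss low.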
Keywords
less certain adversarial examples, robust generalization