A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space

European Conference on Artificial Intelligence (2022)

Abstract
Generating feasible adversarial examples is necessary for properly assessing models that operate on constrained feature spaces. However, it remains challenging to enforce constraints in attacks that were designed for computer vision. We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints. Our framework supports the use cases reported in the literature and can handle both linear and non-linear constraints. We instantiate our framework into two algorithms: a gradient-based attack that introduces constraints into the loss function to be maximized, and a multi-objective search algorithm that aims for misclassification, perturbation minimization, and constraint satisfaction. We show that our approach is effective on two datasets from different domains, with a success rate of up to 100%, where state-of-the-art attacks fail to generate a single feasible example. In addition to adversarial retraining, we propose to introduce engineered non-convex constraints to improve the adversarial robustness of models. We demonstrate that this new defense is as effective as adversarial retraining. Our framework forms the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
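The idea of a gradient-based attack that folds constraints into the maximized loss can be illustrated with a toy sketch. This is not the authors' implementation: the linear model (`W`, `B`), the single constraint `x[1] <= x[0]`, the use of the raw score as the attack objective, and the names `score`, `violation`, and `constrained_attack` are all assumptions made for illustration only.

```python
# Hypothetical toy model: linear score s(x) = W.x + B, predicted class 1 when s(x) > 0.
W = [-1.0, 2.0]
B = -1.05


def score(x):
    """Score of the toy linear model; the attack tries to push it above 0."""
    return sum(wi * xi for wi, xi in zip(W, x)) + B


def violation(x):
    """Penalty for the assumed domain constraint x[1] <= x[0]; zero when satisfied."""
    return max(0.0, x[1] - x[0])


def constrained_attack(x_orig, lam, eps=0.5, alpha=0.1, steps=20):
    """Gradient ascent on J(x) = score(x) - lam * violation(x) inside an L-inf ball.

    Returns the first iterate that is both misclassified and feasible, else None.
    """
    x = list(x_orig)
    for _ in range(steps):
        if score(x) > 0 and violation(x) == 0.0:
            return x
        # Gradient of the score is W; the penalty adds a term only when violated.
        grad = list(W)
        if violation(x) > 0.0:
            grad[0] -= lam * (-1.0)  # d(violation)/dx0 = -1
            grad[1] -= lam * 1.0     # d(violation)/dx1 = +1
        # Ascent step, then clip each feature to stay within eps of the original.
        x = [
            min(max(xi + alpha * gi, oi - eps), oi + eps)
            for xi, gi, oi in zip(x, grad, x_orig)
        ]
    if score(x) > 0 and violation(x) == 0.0:
        return x
    return None


x_orig = [1.0, 0.9]  # feasible input, classified as class 0 by the toy model
with_penalty = constrained_attack(x_orig, lam=3.0)
without_penalty = constrained_attack(x_orig, lam=0.0)
```

In this toy setting, setting `lam=0.0` reduces the procedure to an unconstrained attack whose iterates cross the decision boundary only by breaking the constraint, whereas a positive `lam` steers the search back toward the feasible region. Real domain constraints, models, and loss functions would be considerably richer than this sketch.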
Keywords
Computer Vision: Adversarial learning, adversarial attack and defense methods; Constraint Satisfaction and Optimization: Constraints and Machine Learning; Constraint Satisfaction and Optimization: Constraint Satisfaction; Constraint Satisfaction and Optimization: Constraint Optimization; Search: Evolutionary Computation