Adversarial Deep Learning With Stackelberg Games

Neural Information Processing (ICONIP 2019), Part IV

Abstract
Deep networks are vulnerable to adversarial attacks from malicious adversaries. Many adversarial learning algorithms are currently designed to exploit such vulnerabilities in deep networks. These methods focus on attacking and retraining deep networks with adversarial examples, using feature manipulation, label manipulation, or both. In this paper, we propose a new adversarial learning algorithm for finding adversarial manipulations of deep networks. We formulate adversaries who optimize game-theoretic payoff functions on deep networks performing multi-label classification. We model the interaction between a classifier and an adversary from a game-theoretic perspective and formulate their strategies as a two-player Stackelberg game. We then design algorithms to solve for the Nash equilibrium, a pair of strategies from which neither the classifier nor the adversary has an incentive to deviate. In the attack scenarios we design, the adversary's objective is to deliberately make small changes to test data so that the attacked samples go undetected. Our results illustrate that game-theoretic modelling is highly effective in securing deep learning models against performance vulnerabilities posed by intelligent adversaries.
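The abstract describes an alternating interaction: the adversary (leader) applies small, bounded perturbations to inputs to maximize the classifier's loss, and the classifier (follower) best-responds by retraining, repeating until neither side benefits from deviating. A minimal sketch of such a best-response loop, using a plain logistic-regression follower on toy data rather than the paper's deep networks (all names, data, and hyperparameters here are illustrative assumptions, not the authors' method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: two Gaussian blobs (stand-in for the paper's datasets).
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)), rng.normal(1.0, 0.5, (50, 2))])
y = np.hstack([np.zeros(50), np.ones(50)])

def sigmoid(z):
    # Clip to avoid overflow for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train(X, y, steps=200, lr=0.5):
    """Follower's best response: fit logistic regression by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
    return w

def attack(X, y, w, eps=0.3):
    """Leader's move: a small, norm-bounded perturbation that increases
    the logistic loss (gradient-sign step; eps bounds the change)."""
    p = sigmoid(X @ w)
    grad = np.outer(p - y, w)      # d(loss)/dX for logistic loss
    return X + eps * np.sign(grad)

# Alternate best responses until the strategies (approximately) stabilize,
# i.e. an empirical equilibrium where neither player gains by deviating.
w = train(X, y)
for _ in range(10):
    X_adv = attack(X, y, w)
    w_new = train(X_adv, y)
    if np.linalg.norm(w_new - w) < 1e-3:
        break
    w = w_new

acc_clean = np.mean((sigmoid(X @ w) > 0.5) == y)
acc_adv = np.mean((sigmoid(attack(X, y, w) @ w) > 0.5) == y)
```

This is only a two-player caricature under strong assumptions (linear follower, sign-gradient leader); the paper's formulation replaces the follower with a retrained deep network and solves for the equilibrium of the resulting Stackelberg game.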