An Information Geometric Perspective to Adversarial Attacks and Defenses

IEEE International Joint Conference on Neural Networks (IJCNN), 2022

Abstract
Deep learning models have achieved state-of-the-art accuracy on complex tasks, sometimes surpassing human-level performance. Yet they remain vulnerable to adversarial attacks: imperceptible input perturbations that fool a model on inputs it originally classified correctly. The adversarial phenomenon remains poorly understood and is commonly regarded as an inherent weakness of deep learning models. We argue that understanding and alleviating it may require going beyond the Euclidean view and treating the relationship between the input and output spaces as a statistical manifold with the Fisher Information as its Riemannian metric. Under this information geometric view, the optimal attack is constructed as the direction corresponding to the largest eigenvalue of the Fisher Information Matrix, called the Fisher spectral attack. We show that an orthogonal transformation of the data alters its manifold so that the largest eigenvalue is preserved while the optimal direction of attack changes, thereby deceiving the attacker into perturbing along the wrong direction. We demonstrate the defensive capabilities of the proposed orthogonal scheme, against both the Fisher spectral attack and the popular fast gradient sign method, on standard networks (LeNet and MobileNetV2) and benchmark data sets (MNIST and CIFAR-10).
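The abstract's construction is concrete enough to sketch in code. The block below is a minimal PyTorch illustration, not the authors' implementation, of a Fisher spectral attack on a K-class classifier: the Fisher Information Matrix of p(y|x) with respect to the input factorizes as G = B^T B with B = diag(sqrt(p)) J, where J is the K x d Jacobian of the class log-probabilities, so the top eigenvector of the d x d FIM can be recovered from the small K x K Gram matrix B B^T. The perturbation budget eps, the [0, 1] pixel range, and the sign-selection heuristic are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def fisher_spectral_attack(model, x, eps=0.1):
    """Illustrative sketch: perturb x along the top eigenvector of the
    Fisher Information Matrix (FIM) of p(y|x) w.r.t. the input.

    Assumes x has shape (1, C, H, W) with pixels in [0, 1] and that
    model(x) returns logits of shape (1, K).
    """
    x = x.clone().requires_grad_(True)
    log_p = F.log_softmax(model(x), dim=-1).squeeze(0)    # (K,)
    K = log_p.numel()

    # J[k] = gradient of log p(y=k | x) w.r.t. the flattened input.
    J = torch.stack([
        torch.autograd.grad(log_p[k], x, retain_graph=True)[0].reshape(-1)
        for k in range(K)
    ])                                                    # (K, d)
    p = log_p.exp().detach()
    B = p.sqrt().unsqueeze(1) * J                         # FIM factor: G = B^T B

    # The nonzero eigenvalues of G (d x d) equal those of B B^T (K x K),
    # so diagonalize the small Gram matrix instead of G itself.
    _, evecs = torch.linalg.eigh(B @ B.T)
    v = B.T @ evecs[:, -1]                                # top eigenvector of G
    v = (v / v.norm()).view_as(x)

    # The eigenvector sign is arbitrary; keep the sign that most reduces
    # the probability of the currently predicted class.
    with torch.no_grad():
        y_hat = log_p.argmax()
        cands = [(x + s * eps * v).clamp(0, 1) for s in (1.0, -1.0)]
        scores = [F.log_softmax(model(c), dim=-1)[0, y_hat] for c in cands]
    return cands[0] if scores[0] < scores[1] else cands[1]
```

This also makes the defensive lever plausible: under an orthogonal change of input coordinates x -> Qx, the FIM transforms as G -> Q G Q^T, which preserves every eigenvalue (including the largest) while rotating the eigenvectors by Q, so an attacker who computes the spectral direction in the wrong coordinates steps along a rotated, far less damaging direction.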
Keywords
deep neural network, adversarial defense, information geometry