
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods.

AISec@CCS (2017): 3-14

Cited by: 1122 | Views: 294 | EI

Abstract

Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing...

Introduction
  • Recent years have seen rapid growth in the area of machine learning. Neural networks, an idea that dates back decades, have been a driving force behind this rapid advancement.
  • Given a natural image x, an adversary can produce a visually similar image x′ that has a different classification.
  • Such an instance x′ is known as an adversarial example [39]; such examples have been shown to exist in most domains in which neural networks are used (the standard formulation is sketched below).
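
As a hedged formalization of the bullets above (a sketch, not the paper's exact notation): the adversary seeks a nearby input that the classifier assigns an attacker-chosen incorrect label, where d is a distance metric (e.g., L2) and C(·) denotes the label the network assigns.

    % Sketch of the standard targeted formulation (assumptions: d is a
    % distance metric, t is an attacker-chosen target label).
    \min_{x'} \; d(x, x') \quad \text{subject to} \quad C(x') = t, \;\; t \neq C(x)
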
Highlights
  • Recent years have seen rapid growth in the area of machine learning
  • We study neural networks applied to image classification
  • All neural networks we study are feed-forward networks consisting of multiple layers F^i, each taking as input the result of the previous layers (a minimal sketch follows this list)
  • In this paper, we evaluate ten proposed defenses and demonstrate that none of them are able to withstand a white-box attack.
  • We hope that our work will help raise the bar for the evaluation of proposed defenses and perhaps help others construct more effective defenses. Our evaluations of these defenses expand what is believed to be possible when constructing adversarial examples: we have shown that, so far, there are no known intrinsic properties that differentiate adversarial examples from regular images.
  • Gong et al. [12] achieve 98% accuracy in detecting adversarial examples.
  • We believe that constructing defenses to adversarial examples is an important challenge that must be overcome before these networks are used in potentially security-critical domains, and we hope our work brings us closer to this goal.
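
As a minimal sketch of the feed-forward model described in the highlights (illustrative only; the layer sizes, activations, and weights are placeholders, not the networks evaluated in the paper):

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def feed_forward(x, weights, biases):
        """Compose layers F^i, each consuming the output of the previous layer."""
        h = x
        for W, b in zip(weights[:-1], biases[:-1]):
            h = relu(W @ h + b)                # hidden layers
        logits = weights[-1] @ h + biases[-1]  # final linear layer Z(x)
        return softmax(logits)                 # class probabilities F(x)
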
Results
  • The authors' standard convolutional network achieves 99.4% accuracy on the MNIST dataset [23]. The CIFAR-10 dataset [22] consists of 60,000 32×32 color images of ten different objects (e.g., truck, airplane, etc.).
  • The authors find that even the model that keeps the first 25 principal components is less robust than almost any standard, unsecured convolutional neural network; an unprotected network achieves both higher accuracy (99.5%) and better robustness to adversarial examples.
  • This defense effectively projects the image onto a reduced-dimension manifold (a minimal sketch follows this list).
  • Depending on the application, a 60% detection accuracy can be either very useful or entirely useless
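
A minimal sketch of the dimensionality-reduction defense discussed above, assuming scikit-learn's PCA; train_images is a placeholder array of training images, and only the choice of 25 components comes from the results bullet:

    from sklearn.decomposition import PCA

    # Fit PCA on flattened training images and keep the first 25 components.
    pca = PCA(n_components=25)
    pca.fit(train_images.reshape(len(train_images), -1))  # train_images: placeholder

    def project(x):
        """Project an image onto the reduced-dimension manifold and reconstruct it."""
        reduced = pca.transform(x.reshape(1, -1))   # keep only 25 components
        return pca.inverse_transform(reduced)       # reconstruct before classification
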
Conclusion
  • Unlike standard machine-learning tasks, where achieving a higher accuracy on a single benchmark is in itself a useful and interesting result, this is not sufficient for secure machine learning.
  • In this paper the authors evaluate ten proposed defenses and demonstrate that none of them are able to withstand a white-box attack
  • The authors do this by constructing defense-specific loss functions that they minimize with a strong iterative attack algorithm (a schematic sketch follows this list).
  • With these attacks, on CIFAR an adversary can create imperceptible adversarial examples for each defense.
  • The authors believe that constructing defenses to adversarial examples is an important challenge that must be overcome before these networks are used in potentially security-critical domains, and they hope this work brings them closer to that goal.
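
A schematic of the attack strategy described in the conclusion, assuming the reader supplies classifier_grad and detector_grad (gradients of a misclassification loss and of the detector's "adversarial" score); the constant c, step count, and step size are illustrative, and this is a sketch of the general approach rather than the paper's exact loss functions:

    import numpy as np

    def iterative_attack(x, classifier_grad, detector_grad, c=1.0,
                         steps=100, step_size=0.01):
        """Minimize a defense-specific loss  L_classifier + c * L_detector
        by plain gradient descent, so the result both fools the classifier
        and evades the detector. All parameters are placeholders."""
        x_adv = x.copy()
        for _ in range(steps):
            g = classifier_grad(x_adv) + c * detector_grad(x_adv)
            x_adv = x_adv - step_size * g     # descend the combined loss
            x_adv = np.clip(x_adv, 0.0, 1.0)  # keep pixels in a valid range
        return x_adv

In the paper itself, the minimization is carried out with a strong iterative attack building on [8].
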
Funding
  • This work was supported by the AFOSR under MURI award FA9550-12-10040, Intel through the ISTC for Secure Computing, the Hewlett Foundation through the Center for Long-Term Cybersecurity, and Qualcomm
Study subjects and analysis
papers: 7
2.5 Defenses. In order to better understand which properties are intrinsic to adversarial examples and which are only artificially true because of existing attack techniques, we choose the first seven papers released that construct defenses to detect adversarial examples. Three of the defenses [12, 15, 18] use a second neural network to classify images as natural or adversarial (a minimal sketch of such a detector follows).
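
A minimal sketch of such a "second network" detector, written in PyTorch purely for illustration (the architectures in [12, 15, 18] differ): it maps an image to two logits, natural vs. adversarial, and would be trained with cross-entropy on natural images labeled 0 and adversarial examples labeled 1.

    import torch.nn as nn

    # Binary detector: all layer sizes are placeholders and assume 3x32x32 inputs.
    detector = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 2),  # logits for {natural, adversarial}
    )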

Reference
  • Marco Barreno, Blaine Nelson, Anthony D Joseph, and JD Tygar. 2010. The security of machine learning. Machine Learning 81, 2 (2010), 121–148.
  • Marco Barreno, Blaine Nelson, Russell Sears, Anthony D Joseph, and J Doug Tygar. 2006. Can machine learning be secure?. In Proceedings of the 2006 ACM Symposium on Information, computer and communications security. ACM, 16–25.
  • Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, and Antonio Criminisi. 2016. Measuring neural net robustness with constraints. In Advances In Neural Information Processing Systems. 2613–2621.
  • Arjun Nitin Bhagoji, Daniel Cullina, and Prateek Mittal. 2017. Dimensionality Reduction as a Defense against Evasion Attacks on Machine Learning Classifiers. arXiv preprint arXiv:1704.02654 (2017).
  • Battista Biggio, Igino Corona, Davide Maiorca, Blaine Nelson, Nedim Šrndić, Pavel Laskov, Giorgio Giacinto, and Fabio Roli. 2013. Evasion attacks against machine learning at test time. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 387–402.
  • Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, and others. 2016. End to End Learning for Self-Driving Cars. arXiv preprint arXiv:1604.07316 (2016).
  • Karsten M Borgwardt, Arthur Gretton, Malte J Rasch, Hans-Peter Kriegel, Bernhard Schölkopf, and Alex J Smola. 2006. Integrating structured biological data by kernel maximum mean discrepancy. Bioinformatics 22, 14 (2006), e49–e57.
  • Nicholas Carlini and David Wagner. 2017. Towards evaluating the robustness of neural networks. IEEE Symposium on Security and Privacy (2017).
  • Nilesh Dalvi, Pedro Domingos, Sumit Sanghai, Deepak Verma, and others. 2004. Adversarial classification. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 99–108.
  • Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 248–255.
  • Reuben Feinman, Ryan R Curtin, Saurabh Shintre, and Andrew B Gardner. 2017. Detecting Adversarial Samples from Artifacts. arXiv preprint arXiv:1703.00410 (2017).
  • Zhitao Gong, Wenlu Wang, and Wei-Shinn Ku. 2017. Adversarial and Clean Data Are Not Twins. arXiv preprint arXiv:1704.04960 (2017).
  • Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
  • Arthur Gretton, Karsten M Borgwardt, Malte J Rasch, Bernhard Schölkopf, and Alexander Smola. 2012. A kernel two-sample test. Journal of Machine Learning Research 13, Mar (2012), 723–773.
  • Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, and Patrick McDaniel. 2017. On the (Statistical) Detection of Adversarial Examples. arXiv preprint arXiv:1702.06280 (2017).
  • Shixiang Gu and Luca Rigazio. 2014. Towards deep neural network architectures robust to adversarial examples. arXiv preprint arXiv:1412.5068 (2014).
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770–778.
  • Jan Hendrik Metzen, Tim Genewein, Volker Fischer, and Bastian Bischoff. 2017. On Detecting Adversarial Perturbations. In International Conference on Learning Representations. arXiv preprint arXiv:1702.04267.
  • Dan Hendrycks and Kevin Gimpel. 2017. Early Methods for Detecting Adversarial Images. In International Conference on Learning Representations (Workshop Track).
  • Ruitong Huang, Bing Xu, Dale Schuurmans, and Csaba Szepesvári. 2015. Learning with a strong adversary. CoRR, abs/1511.03034 (2015).
  • Jonghoon Jin, Aysegul Dundar, and Eugenio Culurciello. 2015. Robust Convolutional Neural Networks under Adversarial Noise. arXiv preprint arXiv:1511.06306 (2015).
  • Alex Krizhevsky and Geoffrey Hinton. 2009. Learning multiple layers of features from tiny images. (2009).
  • Yann LeCun, Corinna Cortes, and Christopher JC Burges. 1998. The MNIST database of handwritten digits. (1998).
  • Xin Li and Fuxin Li. 2016. Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics. arXiv preprint arXiv:1612.07767 (2016).
  • Daniel Lowd and Christopher Meek. 2005. Adversarial learning. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 641–647.
  • Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. 2016. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2574–2582.
  • Vinod Nair and Geoffrey E Hinton. 2010. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10). 807–814.
  • Anders Odén and Hans Wedel. 1975. Arguments for Fisher’s permutation test. The Annals of Statistics (1975), 518–520.
  • Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. 2016. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277 (2016).
  • Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z Berkay Celik, and Ananthram Swami. 2016. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on. IEEE, 372–387.
  • Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. 2016. Distillation as a defense to adversarial perturbations against deep neural networks. IEEE Symposium on Security and Privacy (2016).
  • Slav Petrov. 2016. Announcing SyntaxNet: The world's most accurate parser goes open source. Google Research Blog, May 12, 2016.
  • Andras Rozsa, Ethan M Rudd, and Terrance E Boult. 2016. Adversarial diversity and hard positive generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 25–32.
  • Uri Shaham, Yutaro Yamada, and Sahand Negahban. 2015. Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization. arXiv preprint arXiv:1511.05432 (2015).
  • David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, and others. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484–489.
  • Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. 2015. Striving for simplicity: The all convolutional net. In International Conference on Learning Representations (Workshop Track).
  • Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15, 1 (2014), 1929–1958.
  • Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2818–2826.
  • Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. In International Conference on Learning Representations.
  • Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, and others. 2016. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016).
  • Stephan Zheng, Yang Song, Thomas Leung, and Ian Goodfellow. 2016. Improving the robustness of deep neural networks via stability training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4480–4488.