Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty

Dan Hendrycks
Mantas Mazeika
Saurav Kadavath
Dawn Song

Advances in Neural Information Processing Systems 32 (NeurIPS 2019), pp. 15637-15648, 2019.

Keywords: supervised learning, semi-supervised learning

Abstract:

Self-supervision provides effective representations for downstream tasks without requiring labels. However, existing approaches lag behind fully supervised training and are often not thought beneficial beyond obviating or reducing the need for annotations. We find that self-supervision can benefit robustness in a variety of ways, including robustness to adversarial examples, label corruption, and common input corruptions. Additionally, self-supervision greatly benefits out-of-distribution detection on difficult, near-distribution outliers, so much so that it exceeds the performance of fully supervised methods.

Introduction
  • Self-supervised learning holds great promise for improving representations when labeled data are scarce.
  • The authors show that while self-supervision does not substantially improve accuracy when used in tandem with standard training on fully labeled datasets, it can improve several aspects of model robustness, including robustness to adversarial examples [Madry et al., 2018], label corruptions [Patrini et al., 2017, Zhang and Sabuncu, 2018], and common input corruptions such as fog, snow, and blur [Hendrycks and Dietterich, 2019].
  • These gains are masked if one looks at clean accuracy alone, for which performance stays constant.
  • Using self-supervised learning techniques on CIFAR-10 and ImageNet for out-of-distribution detection, the authors are even able to surpass fully supervised methods.
Highlights
  • Self-supervised learning holds great promise for improving representations when labeled data are scarce.
  • We show that while self-supervision does not substantially improve accuracy when used in tandem with standard training on fully labeled datasets, it can improve several aspects of model robustness, including robustness to adversarial examples [Madry et al., 2018], label corruptions [Patrini et al., 2017, Zhang and Sabuncu, 2018], and common input corruptions such as fog, snow, and blur [Hendrycks and Dietterich, 2019].
  • To demonstrate that our method does not rely on gradient obfuscation, we attempted to attack our models using SPSA [Uesato et al., 2018] and failed to notice any performance degradation compared to standard Projected Gradient Descent (PGD) training (a sketch of the training step follows this list).
  • The rotation method increases the area under the receiver operating characteristic curve (AUROC) by 4.8%.
  • We applied self-supervised learning to improve the robustness and uncertainty of deep learning models beyond what was previously possible with purely supervised approaches.
  • We found large improvements in robustness to adversarial examples, label corruption, and common input corruptions.
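As a concrete illustration of the adversarial-training setup referenced above, the following is a minimal PyTorch sketch of PGD adversarial training with an auxiliary rotation-prediction loss. The module names `model.backbone`, `model.cls_head`, and `model.rot_head` and the weight `rot_weight` are hypothetical, introduced here for illustration; the paper's exact architecture, attack configuration, and loss weighting may differ.

```python
import torch
import torch.nn.functional as F

def rotate_batch(x):
    # Stack the batch rotated by 0/90/180/270 degrees and build rotation labels.
    rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)
    labels = torch.arange(4, device=x.device).repeat_interleave(x.size(0))
    return rotated, labels

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=20):
    # Standard L-infinity PGD on the classification loss.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        logits = model.cls_head(model.backbone(x_adv))
        grad = torch.autograd.grad(F.cross_entropy(logits, y), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def training_step(model, optimizer, x, y, rot_weight=0.5):
    # Adversarial cross-entropy plus an auxiliary rotation-prediction loss.
    x_adv = pgd_attack(model, x, y)
    cls_loss = F.cross_entropy(model.cls_head(model.backbone(x_adv)), y)
    x_rot, rot_labels = rotate_batch(x)
    rot_loss = F.cross_entropy(model.rot_head(model.backbone(x_rot)), rot_labels)
    loss = cls_loss + rot_weight * rot_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The ε and α values echo the attack settings listed in Table 1, though the paper's training-time attack may be configured differently; whether the rotation loss is also applied to adversarially perturbed inputs is not specified here, so the sketch applies it to clean images only.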
Methods
  • Training without loss correction methods or self-supervision serves as the first baseline, which the authors call No Correction in Table 2.
  • The authors compare to the state-of-the-art Gold Loss Correction (GLC) of Hendrycks et al. [2018].
  • This is a two-stage loss correction method based on Sukhbaatar et al. [2014] and Patrini et al. [2017]; a sketch of the two-stage procedure follows this list.
  • The authors specify the amount of trusted data available in each experiment as a percentage of the training set.
  • This setup is known as the semi-verified setting [Charikar et al., 2017].
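To make the two-stage procedure concrete, here is a minimal sketch of the GLC as described by Hendrycks et al. [2018]: a model trained on the noisy labels is used to estimate a label-corruption matrix from the small trusted set, and a corrected classifier is then trained through that matrix. The function names and the smoothing constant are ours, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def estimate_corruption_matrix(noisy_model, x_trusted, y_trusted, num_classes):
    # C[i, j] approximates p(noisy label = j | true label = i), estimated by
    # averaging the noisy-label model's softmax outputs over trusted examples
    # whose gold label is i.
    probs = F.softmax(noisy_model(x_trusted), dim=1).cpu()
    C = torch.eye(num_classes)            # identity fallback for empty classes
    for i in range(num_classes):
        mask = y_trusted.cpu() == i
        if mask.any():
            C[i] = probs[mask].mean(dim=0)
    return C

def glc_corrected_loss(logits, noisy_labels, C):
    # Train the corrected classifier so that its clean posterior, pushed through
    # the estimated corruption matrix, matches the observed noisy labels.
    clean_probs = F.softmax(logits, dim=1)
    noisy_probs = clean_probs @ C.to(logits.device)   # p(noisy|x) = p(clean|x) C
    return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_labels)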
Results
  • The authors are able to attain large improvements over standard PGD training by adding self-supervised rotation prediction.
  • Performance is improved by 8.9% on contrast and shot noise and 4.2% on frost, indicating substantial gains in robustness on a wide variety of corruptions.
  • These results demonstrate that self-supervision can regularize networks to be more robust even if clean accuracy is not affected.
  • The performance gains are of comparable magnitude to more complex methods proposed in the literature [Xie et al., 2018].
  • This demonstrates that self-supervised auxiliary rotation prediction can augment OOD detectors based on fully supervised multi-class representations; a scoring sketch follows this list.
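The sketch below shows one way such a rotation-based OOD score can be computed and evaluated: the rotation-prediction loss is averaged over the four rotations of each test image, and a higher loss is treated as evidence that the image is out-of-distribution. `rot_model` is a hypothetical network with a 4-way rotation head, and the AUROC is computed with scikit-learn; the paper's actual scoring function (e.g., whether it also uses a supervised confidence term or translation prediction) may differ.

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

@torch.no_grad()
def rotation_anomaly_score(rot_model, x):
    # Higher score = the model predicts rotations poorly = more likely OOD.
    per_rotation_losses = []
    for k in range(4):
        x_rot = torch.rot90(x, k, dims=(2, 3))
        target = torch.full((x.size(0),), k, dtype=torch.long, device=x.device)
        per_rotation_losses.append(
            F.cross_entropy(rot_model(x_rot), target, reduction='none'))
    return torch.stack(per_rotation_losses).mean(dim=0).cpu().numpy()

def auroc(scores_in, scores_out):
    # AUROC with out-of-distribution examples treated as the positive class.
    labels = np.concatenate([np.zeros(len(scores_in)), np.ones(len(scores_out))])
    scores = np.concatenate([scores_in, scores_out])
    return roc_auc_score(labels, scores)
```

Averaging such AUROC values across the per-class detectors gives the kind of summary numbers reported in Tables 3 and 4.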
Conclusion
  • The authors applied self-supervised learning to improve the robustness and uncertainty of deep learning models beyond what was previously possible with purely supervised approaches.
  • Self-supervision had the largest improvement over supervised techniques in the ImageNet experiments, where the larger input size meant that the authors were able to apply a more complex self-supervised objective.
  • The authors' results suggest that future work in building more robust models and better data representations could benefit greatly from self-supervised approaches.
Tables
  • Table1: Results for our defense. All results use ε = 8.0/255. For 20-step adversaries α = 2.0/255, and for 100-step adversaries α = 0.3/255. More steps do not change results, so the attacks converge. Self-supervision through rotations provides large gains over standard adversarial training
  • Table2: Label corruption results comparing normal training to training with auxiliary rotation self-supervision. Each value is the average error over 11 corruption strengths. All values are percentages. The reliable training signal from self-supervision improves resistance to label noise
  • Table3: AUROC values of different OOD detectors trained on one of ten CIFAR-10 classes. Test time out-of-distribution examples are from the remaining nine CIFAR-10 classes. In-distribution examples are examples belonging to the row’s class. Our self-supervised technique surpasses a fully supervised model. All values are percentages
  • Table4: AUROC values of supervised and self-supervised OOD detectors. AUROC values are an average of 30 AUROCs corresponding to the 30 different models trained on exactly one of the 30 classes. Each model’s in-distribution examples are from one of 30 classes, and the test out-of-distribution samples are from the remaining 29 classes. The self-supervised methods greatly outperform the supervised method. All values are percentages
Related work
  • Self-supervised learning. A number of self-supervised methods have been proposed, each exploring a different pretext task. Doersch et al. [2015] predict the relative position of image patches and use the resulting representation to improve object detection. Dosovitskiy et al. [2016] create surrogate classes to train on by transforming seed image patches. Similarly, Gidaris et al. [2018] predict image rotations (Figure 1; a minimal pretext-training sketch follows this paragraph). Other approaches include using colorization as a proxy task [Larsson et al., 2016], deep clustering methods [Ji et al., 2018], and methods that maximize mutual information [Hjelm et al., 2019] with high-level representations [van den Oord et al., 2018, Hénaff et al., 2019]. These works focus on the utility of self-supervision for learning without labeled data and do not consider its effect on robustness and uncertainty.
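For reference, a minimal sketch of the rotation pretext task in the spirit of Gidaris et al. [2018]: the only training signal is the network's ability to recognize which of four rotations was applied, so no human labels are needed. `rot_net`, the dataloader, and the optimizer are placeholders, and device handling is omitted.

```python
import torch
import torch.nn.functional as F

def pretrain_rotation(rot_net, loader, optimizer, epochs=1):
    # Self-supervised pretraining: predict which of 4 rotations was applied.
    rot_net.train()
    for _ in range(epochs):
        for x, _ in loader:                    # class labels are ignored
            rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
            targets = torch.arange(4).repeat_interleave(x.size(0))
            loss = F.cross_entropy(rot_net(rotated), targets)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```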
Funding
  • This material is in part based upon work supported by the National Science Foundation Frontier Grant
Reference
  • Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, July 2018.
  • Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, and Jörn-Henrik Jacobsen. Invertible residual networks. ArXiv, abs/1811.00995, 2018.
  • Nicholas Carlini and David Wagner. Adversarial examples are not easily detected: Bypassing ten detection methods, 2017.
  • Moses Charikar, Jacob Steinhardt, and Gregory Valiant. Learning from untrusted data. STOC, 2017.
  • Jesse Davis and Mark Goadrich. The relationship between precision-recall and ROC curves. In International Conference on Machine Learning, 2006.
  • Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. CVPR, 2009.
  • Carl Doersch, Abhinav Gupta, and Alexei A. Efros. Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE International Conference on Computer Vision, pages 1422–1430, 2015.
  • Alexey Dosovitskiy, Philipp Fischer, Jost Tobias Springenberg, Martin Riedmiller, and Thomas Brox. Discriminative unsupervised feature learning with exemplar convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9):1734–1747, 2016.
  • Spyros Gidaris, Praveer Singh, and Nikos Komodakis. Unsupervised representation learning by predicting image rotations. In International Conference on Learning Representations, 2018.
  • Izhak Golan and Ran El-Yaniv. Deep anomaly detection using geometric transformations. CoRR, abs/1805.10917, 2018.
  • Dan Hendrycks and Thomas Dietterich. Benchmarking neural network robustness to common corruptions and perturbations. ICLR, 2019.
  • Dan Hendrycks and Kevin Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. ICLR, 2017.
  • Dan Hendrycks, Mantas Mazeika, Duncan Wilson, and Kevin Gimpel. Using trusted data to train deep networks on labels corrupted by severe noise. NeurIPS, 2018.
  • Dan Hendrycks, Kimin Lee, and Mantas Mazeika. Using pre-training can improve model robustness and uncertainty. Proceedings of the International Conference on Machine Learning, 2019a.
  • Dan Hendrycks, Mantas Mazeika, and Thomas Dietterich. Deep anomaly detection with outlier exposure. In International Conference on Learning Representations, 2019b.
  • R Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Phil Bachman, Adam Trischler, and Yoshua Bengio. Learning deep representations by mutual information estimation and maximization. In International Conference on Learning Representations, 2019.
  • Olivier J. Hénaff, Ali Razavi, Carl Doersch, S. M. Ali Eslami, and Aaron van den Oord. Data-efficient image recognition with contrastive predictive coding, 2019.
  • Xu Ji, João F. Henriques, and Andrea Vedaldi. Invariant information distillation for unsupervised image segmentation and clustering. CoRR, abs/1807.06653, 2018.
  • Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial machine learning at scale. ICLR, 2017.
  • Kimin Lee, Honglak Lee, Kibok Lee, and Jinwoo Shin. Training confidence-calibrated classifiers for detecting out-of-distribution samples. ICLR, 2018.
  • Tsung-Yi Lin, Priya Goyal, Ross B. Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. ICCV, 2017.
  • Ilya Loshchilov and Frank Hutter. SGDR: Stochastic gradient descent with warm restarts. ICLR, 2016.
  • Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. ICLR, 2018.
  • David F. Nettleton, Albert Orriols-Puig, and Albert Fornells. A study of the effect of different types of noise on the precision of supervised learning techniques. Artificial Intelligence Review, 2010.
  • Giorgio Patrini, Alessandro Rozza, Aditya Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. CVPR, 2017.
  • Lukas Ruff, Robert A. Vandermeulen, Nico Görnitz, Lucas Deecke, Shoaib A. Siddiqui, Alexander Binder, Emmanuel Müller, and Marius Kloft. Deep one-class classification. In Proceedings of the 35th International Conference on Machine Learning, volume 80, pages 4393–4402, 2018.
  • Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar, and Aleksander Madry. Adversarially robust generalization requires more data. NeurIPS, 2018.
  • Bernhard Schölkopf, Robert Williamson, Alex Smola, John Shawe-Taylor, and John Platt. Support vector method for novelty detection. In Proceedings of the 12th International Conference on Neural Information Processing Systems, NIPS’99, pages 582–588, Cambridge, MA, USA, 1999. MIT Press.
  • Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 2014.
  • Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, and Rob Fergus. Training convolutional networks with noisy labels. ICLR Workshop, 2014.
  • Antonio Torralba, Rob Fergus, and William T. Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. Pattern Analysis and Machine Intelligence, 2008.
  • Jonathan Uesato, Brendan O’Donoghue, Aaron van den Oord, and Pushmeet Kohli. Adversarial risk and the dangers of evaluating against weak attacks. arXiv preprint arXiv:1802.05666, 2018.
  • Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. NeurIPS, 2018.
  • Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba. Anticipating visual representations from unlabeled video. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016. doi: 10.1109/cvpr.2016.18.
  • Carl Vondrick, Abhinav Shrivastava, Alireza Fathi, Sergio Guadarrama, and Kevin Murphy. Tracking emerges by colorizing videos. In The European Conference on Computer Vision (ECCV), September 2018.
  • Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. CBAM: Convolutional block attention module. In The European Conference on Computer Vision (ECCV), September 2018.
  • Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, and Kaiming He. Feature denoising for improving adversarial robustness. arXiv preprint, 2018.
  • Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, and Jianxiong Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. CoRR, abs/1506.03365, 2015.
  • Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. BMVC, 2016.
  • Xiaohua Zhai, Avital Oliver, Alexander Kolesnikov, and Lucas Beyer. S4L: Self-supervised semi-supervised learning, 2019.
  • Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. Theoretically principled trade-off between robustness and accuracy. arXiv preprint arXiv:1901.08573, 2019.
  • Zhilu Zhang and Mert Sabuncu. Generalized cross entropy loss for training deep neural networks with noisy labels. In Advances in Neural Information Processing Systems, pages 8778–8788, 2018.