# Understanding and Mitigating the Tradeoff Between Robustness and Accuracy

ICML, pp.7909-7919, (2020)

Abstract

Adversarial training augments the training set with perturbations to improve the robust error (over worst-case perturbations), but it often leads to an increase in the standard error (on unperturbed test inputs). Previous explanations for this tradeoff rely on the assumption that no predictor in the hypothesis class has low standard and...

Introduction

- Adversarial training methods (Goodfellow et al, 2015; Madry et al, 2017) attempt to improve the robustness of neural networks against adversarial examples (Szegedy et al, 2014) by augmenting the training set with perturbed examples that preserve the label but that fool the current model
- While such methods decrease the robust error, the error on worst-case perturbed inputs, they have been observed to cause an undesirable increase in the standard error, the error on unperturbed inputs (Madry et al, 2018; Zhang et al, 2019; Tsipras et al, 2019).
- On CIFAR-10, the authors find that the gap between the standard error of adversarial training and that of standard training shrinks as the labeled dataset grows, suggesting that the tradeoff could disappear with infinite data (see Figure 1).
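
To make the two error notions concrete, the following minimal sketch computes both for a linear classifier under ℓ∞ perturbations. The toy data, the weight vector and the function names are illustrative assumptions, not taken from the paper, and the closed-form worst case used here holds only for linear models.

```python
import numpy as np

def standard_error(w, X, y):
    """Fraction of unperturbed inputs that are misclassified (margin <= 0)."""
    margins = y * (X @ w)
    return float(np.mean(margins <= 0))

def robust_error_linf(w, X, y, eps):
    """Fraction of inputs misclassified under the worst l_inf perturbation.

    For a linear model the worst case has a closed form: a perturbation
    with ||delta||_inf <= eps lowers the margin by at most eps * ||w||_1.
    """
    worst_margins = y * (X @ w) - eps * np.abs(w).sum()
    return float(np.mean(worst_margins <= 0))

# Hypothetical toy problem: four points in two dimensions, labels in {-1, +1}.
w = np.array([1.0, -1.0])
X = np.array([[2.0, 0.0], [0.5, 0.0], [0.0, 1.0], [-1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, 1.0])

std_err = standard_error(w, X, y)
rob_err = robust_error_linf(w, X, y, eps=0.3)
print("standard error:", std_err, "robust error:", rob_err)
```

Because every perturbation set contains the zero perturbation, the robust error can never be smaller than the standard error.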

Highlights

- In Section 4.2, we prove that robust self-training eliminates the tradeoff for linear regression—robust self-training does not increase standard error compared to the standard estimator while simultaneously achieving the best possible robust error, matching the standard error (see Figure 2(c) for the effect of robust self-training on the spline problem)
- As previous works only focus on the empirical evaluation of the gains in robustness via robust self-training, we systematically evaluate the effect of robust self-training on both the standard and robust error on CIFAR-10 when using unlabeled data from Tiny Images as sourced in Carmon et al (2019)
- We focus on interpolating estimators in highly overparameterized models, motivated by modern machine learning models that achieve near zero training loss
- In Section 4.2, we prove that in linear regression, robust self-training eliminates the tradeoff between standard and robust error (Theorem 2)
- We prove that robust self-training eliminates the increase in standard error in this setting while achieving low robust error, by showing that robust self-training appropriately regularizes the augmented estimator towards the standard estimator
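
The idea of regularizing the augmented estimator towards the standard estimator can be sketched for noiseless linear regression as follows. The constrained formulation below is an assumption chosen for illustration, not a transcription of the paper's estimator: it keeps the estimator as close as possible to the standard minimum-norm interpolant in the covariance-weighted norm while still fitting all labeled and extra points. The covariance `Sigma` is assumed known here (in practice it would have to be estimated, for instance from unlabeled inputs), and all names and dimensions are hypothetical.

```python
import numpy as np

def min_norm_interpolant(X, y):
    """Minimum-norm solution of X @ theta = y (the standard estimator here)."""
    return np.linalg.pinv(X) @ y

def rst_linear(X_std, y_std, X_ext, y_ext, Sigma):
    """Assumed RST-style estimator: minimize (theta - theta_std)^T Sigma
    (theta - theta_std) subject to fitting both the standard and extra points."""
    theta_std = min_norm_interpolant(X_std, y_std)
    A = np.vstack([X_std, X_ext])
    b = np.concatenate([y_std, y_ext])
    residual = b - A @ theta_std
    # Closed form of the equality-constrained quadratic program.
    Sigma_inv_At = np.linalg.solve(Sigma, A.T)
    delta = Sigma_inv_At @ np.linalg.solve(A @ Sigma_inv_At, residual)
    return theta_std + delta

def standard_error(theta, theta_star, Sigma):
    """Population squared error of a linear predictor in the noiseless model."""
    diff = theta - theta_star
    return float(diff @ Sigma @ diff)

rng = np.random.default_rng(0)
d, n_std, n_ext = 20, 5, 3                      # overparameterized: d > n
Sigma = np.diag(np.linspace(0.1, 5.0, d))       # assumed population covariance
theta_star = rng.normal(size=d)
X_std = rng.normal(size=(n_std, d))
X_ext = rng.normal(size=(n_ext, d))             # perfectly labeled extra points
y_std, y_ext = X_std @ theta_star, X_ext @ theta_star

theta_std = min_norm_interpolant(X_std, y_std)
theta_aug = min_norm_interpolant(np.vstack([X_std, X_ext]),
                                 np.concatenate([y_std, y_ext]))
theta_rst = rst_linear(X_std, y_std, X_ext, y_ext, Sigma)

err_std = standard_error(theta_std, theta_star, Sigma)
err_aug = standard_error(theta_aug, theta_star, Sigma)
err_rst = standard_error(theta_rst, theta_star, Sigma)
print("standard:", err_std, "augmented:", err_aug, "RST-style:", err_rst)
```

In this sketch the RST-style estimator is a projection of the standard estimator, in the `Sigma`-weighted norm, onto a set of interpolants that contains the true parameter, so its standard error cannot exceed that of the standard estimator. The plain minimum-norm augmented estimator carries no such guarantee.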

Methods

**Complementary methods for robustness and accuracy**

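In Table 1, the authors also report the standard and robust errors of other methods that improve the tradeoff between standard and robust error: Interpolated Adversarial Training (IAT) (Lamb et al, 2019), a training algorithm based on Mixup, and Neural Architecture Search (NAS) (Cubuk et al, 2017), which uses reinforcement learning to search for more robust architectures.

- Robust self-training (RST), IAT and NAS are incomparable, as they find different tradeoffs between standard and robust error.
- The authors believe that, since RST provides a complementary statistical perspective on the tradeoff, it can be combined with methods like IAT or NAS for further gains.
- The authors leave this to future work.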

Conclusion

- The authors studied the commonly observed increase in standard error upon adversarial training, taking generalization from finite data into account.
- The authors' analysis reveals that the interplay between the inductive bias of models and the underlying geometry of the inputs causes the standard error to increase even when the augmented data is perfectly labeled
- This insight provides a method that provably eliminates the increase in standard error upon augmentation in linear regression by incorporating an appropriate regularizer based on the geometry of the inputs.
- How best to utilize unlabeled data, and whether sufficient unlabeled data would completely eliminate the tradeoff, remain open questions.

- Table 1: Performance of robust self-training (RST) applied to different perturbations and adversarial training algorithms. (Left) CIFAR-10 standard and robust test accuracy against ℓ∞ perturbations of size ε = 8/255. All methods use ε = 8/255 during training and use the WRN-28-10 model. Robust accuracies are against a PGD-based attack with 20 steps. (Right) CIFAR-10 standard and robust test accuracy against a grid attack of rotations up to 30 degrees and translations up to ∼10% of the image size, following (Engstrom et al, 2019). All adversarial and random methods use the same parameters during training and use the WRN-40-2 model. For both tables, shaded rows make use of 500K unlabeled images from 80M Tiny Images sourced in (Carmon et al, 2019). RST improves both the standard and robust accuracy over the vanilla counterparts for different algorithms (AT and TRADES) and different perturbations (ℓ∞ and rotation/translations).
- Table 1 (right) presents the results. In the regime where vanilla robust training does not hurt standard error, RST in fact further improves the standard error by almost 1% and the robust error by 2-3% over the standard and robust estimators for both forms of robust training. Thus, in settings where vanilla robust training improves standard error, RST seems to further amplify the gains, while in settings where vanilla robust training hurts standard error, RST mitigates the harmful effect.
- Table 2: Test accuracies for the standard model, vanilla adversarial training (AT), and AT with RST for ε = 1/255 on the full CIFAR-10 dataset. Accuracies are averaged over two trials. The robust test accuracy of the standard model is near 0%.
- Table 3: Test accuracies for the standard model, vanilla adversarial training (AT), and AT with RST for ε = 2/255 on the full CIFAR-10 dataset. Accuracies are averaged over two trials. The robust test accuracy of the standard model is near 0%.
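
The robust accuracies above are measured against a multi-step projected gradient attack. As a rough illustration of what such an attack does, the sketch below runs 20 signed-gradient steps inside an ℓ∞ ball of radius 8/255 against a linear model with logistic loss. The linear model, the random input and the step-size heuristic are assumptions made for brevity; the paper evaluates wide residual networks, which this sketch does not attempt to reproduce.

```python
import numpy as np

def logistic_loss(w, x, y):
    """Logistic loss of a linear model on one example with label y in {-1, +1}."""
    return float(np.log1p(np.exp(-y * (w @ x))))

def pgd_linf(w, x, y, eps=8 / 255, steps=20, step_size=None):
    """Projected gradient ascent on the logistic loss within an l_inf ball."""
    if step_size is None:
        step_size = 2.5 * eps / steps  # assumed heuristic, not from the source
    x_adv = x.copy()
    for _ in range(steps):
        # Gradient of log(1 + exp(-y * w.x)) with respect to x.
        grad = -y * w / (1.0 + np.exp(y * (w @ x_adv)))
        x_adv = x_adv + step_size * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto the l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid pixel range
    return x_adv

rng = np.random.default_rng(0)
w = rng.normal(size=16)                # hypothetical linear model
x = rng.uniform(0.0, 1.0, size=16)     # hypothetical "image" with pixels in [0, 1]
y = 1.0

x_adv = pgd_linf(w, x, y)
clean_loss = logistic_loss(w, x, y)
adv_loss = logistic_loss(w, x_adv, y)
print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

An example counts as robustly correct only if the model still classifies the attacked input `x_adv` correctly.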

Reference

- Alzantot, M., Sharma, Y., Elgohary, A., Ho, B., Srivastava, M., and Chang, K. Generating natural language adversarial examples. In Empirical Methods in Natural Language Processing (EMNLP), 2018.
- Bartlett, P. L., Long, P. M., Lugosi, G., and Tsigler, A. Benign overfitting in linear regression. arXiv, 2019.
- Belkin, M., Ma, S., and Mandal, S. To understand deep learning we need to understand kernel learning. In International Conference on Machine Learning (ICML), 2018.
- Carmon, Y., Raghunathan, A., Schmidt, L., Liang, P., and Duchi, J. C. Unlabeled data improves adversarial robustness. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Cubuk, E. D., Zoph, B., Schoenholz, S. S., and Le, Q. V. Intriguing properties of adversarial examples. arXiv preprint arXiv:1711.02846, 2017.
- Diamond, S. and Boyd, S. CVXPY: A Python-embedded modeling language for convex optimization. Journal of Machine Learning Research (JMLR), 17(83):1–5, 2016.
- Engstrom, L., Tran, B., Tsipras, D., Schmidt, L., and Madry, A. Exploring the landscape of spatial robustness. In International Conference on Machine Learning (ICML), pp. 1802–1811, 2019.
- Fawzi, A., Fawzi, O., and Frossard, P. Analysis of classifiers’ robustness to adversarial perturbations. Machine Learning, 107(3):481–508, 2018.
- Friedman, J., Hastie, T., and Tibshirani, R. The elements of statistical learning, volume 1. Springer Series in Statistics, New York, NY, USA, 2001.
- Goodfellow, I. J., Shlens, J., and Szegedy, C. Explaining and harnessing adversarial examples. In International Conference on Learning Representations (ICLR), 2015.
- Hastie, T., Montanari, A., Rosset, S., and Tibshirani, R. J. Surprises in high-dimensional ridgeless least squares interpolation. arXiv preprint arXiv:1903.08560, 2019.
- Jia, R. and Liang, P. Adversarial examples for evaluating reading comprehension systems. In Empirical Methods in Natural Language Processing (EMNLP), 2017.
- Krizhevsky, A., Sutskever, I., and Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NeurIPS), pp. 1097–1105, 2012.
- Laine, S. and Aila, T. Temporal ensembling for semi-supervised learning. In International Conference on Learning Representations (ICLR), 2017.
- Lamb, A., Verma, V., Kannala, J., and Bengio, Y. Interpolated adversarial training: Achieving robust neural networks without sacrificing too much accuracy. arXiv, 2019.
- Liang, T. and Rakhlin, A. Just interpolate: Kernel "ridgeless" regression can generalize. arXiv preprint arXiv:1808.00387, 2018.
- Ma, S., Bassily, R., and Belkin, M. The power of interpolation: Understanding the effectiveness of SGD in modern over-parametrized learning. In International Conference on Machine Learning (ICML), 2018.
- Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards deep learning models resistant to adversarial attacks (published at ICLR 2018). arXiv, 2017.
- Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (ICLR), 2018.
- Miyato, T., Maeda, S., Ishii, S., and Koyama, M. Virtual adversarial training: a regularization method for supervised and semi-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
- Najafi, A., Maeda, S., Koyama, M., and Miyato, T. Robustness to adversarial perturbations in learning from incomplete data. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Nakkiran, P. Adversarial robustness may be at odds with simplicity. arXiv preprint arXiv:1901.00532, 2019.
- Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., and Madry, A. Robustness may be at odds with accuracy. In International Conference on Learning Representations (ICLR), 2019.
- Uesato, J., Alayrac, J., Huang, P., Stanforth, R., Fawzi, A., and Kohli, P. Are labels required for improving adversarial robustness? In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Xie, Q., Dai, Z., Hovy, E., Luong, M., and Le, Q. V. Unsupervised data augmentation. arXiv preprint arXiv:1904.12848, 2019.
- Yaeger, L., Lyon, R., and Webb, B. Effective training of a neural network character classifier for word recognition. In Advances in Neural Information Processing Systems (NeurIPS), pp. 807–813, 1996.
- Yang, F., Wang, Z., and Heinze-Deml, C. Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Yin, D., Lopes, R. G., Shlens, J., Cubuk, E. D., and Gilmer, J. A fourier perspective on model robustness in computer vision. arXiv preprint arXiv:1906.08988, 2019.
- Zagoruyko, S. and Komodakis, N. Wide residual networks. In British Machine Vision Conference, 2016.
- Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. Understanding deep learning requires rethinking generalization. In International Conference on Learning Representations (ICLR), 2017.
- Zhang, H., Yu, Y., Jiao, J., Xing, E. P., Ghaoui, L. E., and Jordan, M. I. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning (ICML), 2019.
- Sajjadi, M., Javanmardi, M., and Tasdizen, T. Regularization with stochastic transformations and perturbations for deep semi-supervised learning. In Advances in Neural Information Processing Systems (NeurIPS), pp. 1163–1171, 2016.
- Schmidt, L., Santurkar, S., Tsipras, D., Talwar, K., and Madry, A. Adversarially robust generalization requires more data. In Advances in Neural Information Processing Systems (NeurIPS), pp. 5014–5026, 2018.
- Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
- 1. When the population covariance $\Sigma = I$, from Theorem 1, we see that
- 2. When $\Pi^{\perp}_{\text{aug}} = 0$, the vector $w$ in Theorem 1 is $0$, and hence we get
- 3. We prove the eigenvector condition in Section B.7, which studies the effect of augmenting with a single extra point in general.
- 2. Thus when $\Pi^{\perp}_{\text{std}}\theta$ is an eigenvector of $\Sigma$, there are no augmentations $x_{\text{ext}}$ that increase the standard error.
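- 1. Set $t = 0$. Initialize $\theta^{t} = \theta_{\text{int-std}}$ and $(X_{\text{ext}})_{0}$ as an empty matrix.
- 2. At iteration $t$, solve for $x^{t}_{\text{ext}} = \arg\max_{x_{\text{ext}} \in T} (x_{\text{ext}}^{\top}\theta^{t})^{2}$. If the objective is unbounded, choose any $x^{t}_{\text{ext}}$ such that $(x^{t}_{\text{ext}})^{\top}\theta^{t} \neq 0$.
- 3. If $(x^{t}_{\text{ext}})^{\top}\theta^{t} = 0$, stop and return $(X_{\text{ext}})_{t}$.
- 4. Otherwise, add $x^{t}_{\text{ext}}$ as a row in $(X_{\text{ext}})_{t}$. Increment $t$ and let $\theta^{t}$ solve (31) with $X_{\text{ext}} = (X_{\text{ext}})_{t}$.
- 5. Return to step 2.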
- 1. Self-training (pseudo-labeling): Classical self-training does not deal with data augmentation or robustness. We view RST as a generalization of self-training in the context of data augmentations. Here the pseudo-labels are generated by a standard non-augmented estimator that is not trained on the labeled augmented points. In contrast, standard
- 2. Robust consistency training: Another popular semi-supervised learning strategy is based on enforcing consistency in a model's predictions across various perturbations of the unlabeled data (Miyato et al., 2018; Xie et al., 2019; Sajjadi et al., 2016; Laine & Aila, 2017). RST is similar in spirit, but has an additional crucial component. We first generate pseudo-labels by performing standard training, and rather than simply enforcing consistency across perturbations, RST enforces that the unlabeled data and their perturbations are matched with the generated pseudo-labels.
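
The difference from plain consistency training can be sketched as a data-construction step: a standard model is trained first, its predictions on the unlabeled inputs become pseudo-labels, and every perturbed copy of an unlabeled input is paired with that same pseudo-label. The least-squares classifier, the random perturbation function and all names below are hypothetical stand-ins; a final model would then be trained on the returned set with a robust objective.

```python
import numpy as np

def fit_standard(X, y):
    """Stand-in for standard training: minimum-norm least squares on +/-1 labels."""
    return np.linalg.pinv(X) @ y

def build_rst_training_set(X_lab, y_lab, X_unl, perturb, n_perturb=2):
    """Combine labeled data, pseudo-labeled data and perturbed pseudo-labeled data."""
    theta_std = fit_standard(X_lab, y_lab)                 # trained on clean labeled data only
    pseudo = np.where(X_unl @ theta_std >= 0, 1.0, -1.0)   # pseudo-labels
    X_parts, y_parts = [X_lab, X_unl], [y_lab, pseudo]
    for _ in range(n_perturb):
        X_parts.append(perturb(X_unl))   # perturbed copies of the unlabeled inputs
        y_parts.append(pseudo)           # matched to the pseudo-label, not re-predicted
    return np.vstack(X_parts), np.concatenate(y_parts), pseudo

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(6, 4))
y_lab = np.where(X_lab[:, 0] >= 0, 1.0, -1.0)   # hypothetical ground-truth rule
X_unl = rng.normal(size=(10, 4))

def small_noise(X):
    """Hypothetical perturbation: bounded random noise on every coordinate."""
    return X + rng.uniform(-0.1, 0.1, size=X.shape)

X_rst, y_rst, pseudo = build_rst_training_set(X_lab, y_lab, X_unl, small_noise)
print(X_rst.shape, y_rst.shape)
```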
