Chaos Theory meets deep learning: On Lyapunov exponents and adversarial perturbations

Semantic Scholar (2018)

Abstract
In this paper, we would like to disseminate a serendipitous discovery involving Lyapunov exponents of a 1-D time series and their use as a filtering defense tool against a specific kind of deep adversarial perturbation. To this end, we use the state-of-the-art CleverHans library to generate adversarial perturbations against a standard Convolutional Neural Network (CNN) architecture trained on the MNIST dataset. We empirically demonstrate how the Lyapunov exponents computed on the flattened 1-D vector representations of the images served as highly discriminative features that could be used to pre-classify images as adversarial or legitimate before feeding the image into the CNN for classification.

1. Background on defenses against adversarial attacks

In the recent past, a plethora of defenses against adversarial attacks have been proposed. These include SafetyNet [12], adversarial training [19], label smoothing [20], defensive distillation [17] and feature-squeezing [21, 22], to name a few. There is also an ongoing Kaggle contest [2] for exploring novel defenses against adversarial attacks. As evinced by the recent spurt of papers on this topic, most proposed defenses are soon quelled by a novel attack that exploits some weakness in the defense. In [10], the authors asked whether one could concoct a strong defense by combining multiple defenses, and showed that an ensemble of weak defenses was not sufficient to provide a strong defense against the adversarial examples they were able to craft. With this background, we shall now look more closely at a specific type of defense and motivate the relevance of our method within this framework.

1.1. The pre-detector based defenses

One prominent approach that emerges in the literature on adversarial defenses is that of crafting pre-detection and filtering systems that flag inputs that might be potentially adversarial. In [9], the authors posit that adversarial examples are not drawn from the same distribution as legitimate samples and can thus be detected using statistical tests. In [14, 7], the authors train a separate binary classifier to first classify any input image as legitimate or adversarial, and then perform inference only on the images that pass. In approaches such as [6], the authors assume that DNNs classify accurately only near the small manifold of training data and that synthetic adversarial samples do not lie on this data manifold; they apply dropout at test time to estimate the confidence that an input image is adversarial.

In this paper, we would like to disseminate a model-agnostic approach towards adversarial defense that depends purely on the quasi-time-series statistics of the input images and was discovered in a rather serendipitous fashion. The goal is not to present the proposed method as a fool-proof adversarial filter, but to draw the attention of the DNN and CV communities to this chance discovery, which we feel is worthy of further inquiry.
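To make the described pipeline concrete, the following is a minimal sketch (not the authors' code) of how a Lyapunov-exponent feature could be computed on a flattened MNIST image and fed to a simple pre-detector that filters inputs before the CNN sees them. The choice of the third-party nolds package (Rosenstein's largest-exponent estimator, nolds.lyap_r), the logistic-regression detector, and the Keras-style cnn.predict call are all assumptions made for illustration; the abstract does not specify which estimator or detector the authors used.

```python
# Hedged sketch of a Lyapunov-exponent pre-detector for adversarial inputs.
# Assumptions (not from the paper): nolds.lyap_r as the exponent estimator,
# a logistic-regression pre-classifier, and a Keras-style CNN `cnn`.
import numpy as np
import nolds                                   # pip install nolds
from sklearn.linear_model import LogisticRegression


def lyapunov_feature(image):
    """Largest Lyapunov exponent of the image treated as a 1-D 'time series'."""
    series = np.asarray(image, dtype=np.float64).ravel()  # 28x28 -> 784 samples
    return nolds.lyap_r(series, emb_dim=10)               # Rosenstein's method


def fit_pre_detector(legit_images, adv_images):
    """Binary pre-classifier on the scalar Lyapunov feature (0 = legit, 1 = adv)."""
    X = np.array([[lyapunov_feature(img)]
                  for img in list(legit_images) + list(adv_images)])
    y = np.array([0] * len(legit_images) + [1] * len(adv_images))
    return LogisticRegression().fit(X, y)


def filtered_predict(cnn, detector, image):
    """Reject inputs flagged as adversarial; otherwise classify with the CNN."""
    if detector.predict([[lyapunov_feature(image)]])[0] == 1:
        return None                                        # flagged: do not classify
    return int(cnn.predict(image[None, ..., None]).argmax())
```

In such a setup, the adversarial images used to fit the detector would be generated with CleverHans against the same CNN, mirroring the experimental setting described in the abstract; the single-feature logistic regression is deliberately minimal and could equally be replaced by a simple threshold on the estimated exponent.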