
LEMNA: Explaining Deep Learning based Security Applications.

ACM Conference on Computer and Communications Security (CCS), pp. 364-379, 2018

Cited by: 140 | Views: 608

Abstract

While deep learning has shown great potential in various domains, the lack of transparency has limited its application in security or safety-critical areas. Existing research has attempted to develop explanation techniques to provide interpretable explanations for each classification decision. Unfortunately, current methods are optimized...

Introduction
  • The authors evaluate the system using two popular deep learning applications in security.
  • Extensive evaluations show that LEMNA’s explanation has a much higher fidelity level than existing methods.
  • The authors demonstrate practical use cases of LEMNA that help machine learning developers validate model behavior, troubleshoot classification errors, and automatically patch errors in the target models.
Highlights
  • We evaluate our system using two popular deep learning applications in security
  • In §5, we demonstrate LEMNA's generalizability by applying it to security applications built on both Recurrent Neural Networks (RNNs) and a Multilayer Perceptron (MLP)
  • This paper introduces LEMNA, a new method to derive high-fidelity explanations for individual classification results for security applications
  • LEMNA treats a target deep learning model as a blackbox and approximates its decision boundary through a mixture regression model enhanced by fused lasso (the fused lasso penalty is sketched after this list)
  • By evaluating it on two popular deep learning based security applications, we show that the proposed method produces highly accurate explanations
  • We demonstrate how machine learning developers and security analysts can benefit from LEMNA to better understand classifier behavior, troubleshoot misclassification errors, and even perform automated patches to enhance the original deep learning model
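As background for the fused lasso mentioned above: the penalty of Tibshirani et al. [64] augments an ordinary regression loss with terms that both shrink coefficients and pull adjacent coefficients toward each other, which is one way to encode dependencies between neighboring features (e.g., adjacent bytes or tokens in a sequence). A standard, non-mixture form of the objective is shown below; LEMNA combines a penalty of this kind with a mixture regression model, whose exact formulation is not reproduced here.

```latex
\min_{\beta}\; \lVert y - X\beta \rVert_2^2
  \;+\; \lambda_1 \sum_{j=1}^{d} \lvert \beta_j \rvert
  \;+\; \lambda_2 \sum_{j=2}^{d} \lvert \beta_j - \beta_{j-1} \rvert
```

The second penalty (the fusion term) pushes neighboring features toward similar weights, so explanations tend to highlight contiguous regions, such as a run of adjacent bytes, rather than isolated features.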
Methods
  • Blackbox methods such as LIME do not support RNNs well either: methods like LIME assume features are independent, but this assumption is violated by RNNs, which explicitly model the dependencies of sequential data.
  • Supporting Locally Non-linear Decision Boundaries.
  • Most existing methods (e.g., LIME) assume the local linearity of the decision boundary.
  • When the local decision boundary is non-linear, which is true for most complex networks, those explanation methods would produce serious errors.
  • Typical sampling methods can produce artificial data points beyond the linear region, making it difficult for a linear model to approximate the decision boundary near x.
  • Later experiments (§5) confirm that a simple linear approximation significantly degrades the explanation fidelity; a simplified sketch of a non-linear local approximation follows this list.
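To make the non-linear local approximation concrete, below is a minimal, illustrative sketch (not the authors' implementation): it draws perturbed copies of the input, queries the black-box classifier for the predicted probability of the target class, and fits a small mixture of linear regressions with a basic EM loop. The sampling scheme, the batch-style black-box interface `f`, and all function names are assumptions for illustration, and the paper's fused lasso penalty is omitted.

```python
import numpy as np

def perturb(x, n=500, p=0.3, seed=0):
    """Draw perturbed copies of x by randomly zeroing features
    (a LIME-style sampling heuristic; real feature semantics matter in practice)."""
    rng = np.random.default_rng(seed)
    keep = rng.random((n, x.shape[0])) > p      # True = keep the feature value
    return keep * x

def fit_mixture_regression(X, y, k=3, iters=50, ridge=1e-3, seed=0):
    """Fit a k-component mixture of linear regressions with a basic EM loop.
    (LEMNA additionally imposes a fused lasso penalty, omitted here for brevity.)"""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])        # add an intercept column
    betas = rng.normal(scale=0.01, size=(k, d + 1))
    sigmas = np.full(k, y.std() + 1e-6)
    pis = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resid = y[None, :] - betas @ Xb.T                          # (k, n)
        logp = (np.log(pis)[:, None]
                - 0.5 * np.log(2 * np.pi * sigmas[:, None] ** 2)
                - 0.5 * (resid / sigmas[:, None]) ** 2)
        logp -= logp.max(axis=0, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=0, keepdims=True)                    # (k, n)
        # M-step: weighted ridge regression per component.
        for j in range(k):
            w = resp[j]
            A = Xb.T @ (w[:, None] * Xb) + ridge * np.eye(d + 1)
            betas[j] = np.linalg.solve(A, Xb.T @ (w * y))
            err = y - Xb @ betas[j]
            sigmas[j] = np.sqrt((w * err ** 2).sum() / w.sum()) + 1e-6
            pis[j] = w.mean()
    return betas, sigmas, pis

def explain(f, x, k=3):
    """Approximate the black-box f around x; return per-feature weights from the
    mixture component that best matches f's prediction at x."""
    X = perturb(x)
    y = f(X)                                    # black-box probability of the target class
    betas, _, _ = fit_mixture_regression(X, y, k=k)
    xb = np.append(x, 1.0)
    best = np.argmin([abs(f(x[None, :])[0] - b @ xb) for b in betas])
    return betas[best][:-1]                     # importance score per feature
```

The weights of the best-fitting component play the role of per-feature importance scores; a single linear fit, as in LIME, corresponds to k = 1.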
Results
  • The authors' experiments show that LEMNA outperforms LIME and the random baseline by a significant margin across all fidelity metrics (an RMSE-style fidelity check is sketched after this list).
  • To demonstrate the effectiveness of patching, the authors perform the above procedure on all 5 classifiers.
  • Table 6 shows the classifier performance before and after the patching.
  • The authors' experiment shows that both false positives and false negatives can be reduced after retraining for all five classifiers.
  • These results demonstrate that, by understanding the model's behavior, the authors can identify weaknesses of the model and enhance it
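As a rough illustration of how the fidelity numbers in Table 3 can be read, here is a hedged sketch of an RMSE-style fidelity check: a fitted local surrogate (linear weights plus intercept) is compared against the black-box model's output probabilities on samples drawn around the input. The sampling scheme and the batch-style interface `f` are assumptions, not the paper's exact protocol.

```python
import numpy as np

def local_rmse(f, x, weights, intercept=0.0, n=500, p=0.3, seed=0):
    """RMSE between a local linear surrogate and the black-box model f,
    evaluated on perturbed neighbours of x (lower = higher fidelity)."""
    rng = np.random.default_rng(seed)
    X = (rng.random((n, x.shape[0])) > p) * x   # perturbed copies of x
    approx = X @ weights + intercept            # surrogate's predicted probability
    return float(np.sqrt(np.mean((f(X) - approx) ** 2)))
```

Computing this value for a LIME-style surrogate and for the best-fitting mixture component gives the kind of side-by-side comparison summarized in Table 3.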
Conclusion
  • LEMNA treats a target deep learning model as a blackbox and approximates its decision boundary through a mixture regression model enhanced by fused lasso.
  • By evaluating it on two popular deep learning based security applications, the authors show that the proposed method produces highly accurate explanations.
  • The authors demonstrate how machine learning developers and security analysts can benefit from LEMNA to better understand classifier behavior, troubleshoot misclassification errors, and even perform automated patches to enhance the original deep learning model
Tables
  • Table 1: Design space of explainable machine learning for security applications.
  • Table 2: Classification accuracy of the trained classifiers.
  • Table 3: The Root Mean Square Error (RMSE) of the local approximation. LEMNA is more accurate than LIME.
  • Table 4: Hyper-parameter sensitivity testing results.
  • Table 5: Case study for the binary analysis (15 cases). Our explanation method ranks features and marks the most important features as red, followed by orange, gold, and yellow. We also translate the hex code to assembly code for ease of understanding. Note that the F. start refers to the function start detected by the deep learning classifier; the function start is also marked by a black square in the hex sequence. *For false negatives under R.F.N., we present the real function start that the classifier failed to detect, and explain why the function start is missed.
  • Table 6: Classification results before and after patching.
  • Table 7: The hyper-parameters of the corresponding deep learning models. Here “model structure” gives the number of layers in the model as well as the number of units in each layer. Note that for the four models in the function start identification application (i.e., O0-O3), we use the same set of hyper-parameters.
  • Table 8: Case study for PDF malware classification (4 cases). Features 31 and 33 are related to “JavaScript Object Markers”.
Related work
  • Since most related work has been discussed in §2 and §3, we briefly discuss other related efforts here.

    Improving Machine Learning Robustness. A deep learning model can be deceived by an adversarial sample (i.e., a malicious input crafted to cause misclassification) [61]. To improve the model resistance, researchers have proposed various defense methods [9, 20, 36, 40, 67]. The most relevant work is adversarial training [20].

    Adversarial training adds adversarial examples to the training dataset and retrains the model to make it more robust. Various techniques are available to craft adversarial examples for adversarial training [11, 33, 42, 72]. A key difference between our patching method and standard adversarial training is that our patching is based on an understanding of the errors; we try to avoid blindly retraining the model, which may introduce new vulnerabilities (a generic adversarial-training loop is sketched below for contrast).
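For contrast with the error-guided patching discussed above, the sketch below shows a generic adversarial-training loop that uses a single gradient-sign (FGSM-style) step as the example-crafting technique. The model, data loader, optimizer, and hyper-parameters are placeholders, it assumes continuous differentiable inputs (which many security features are not), and it illustrates the standard approach rather than the authors' patching procedure.

```python
import torch
import torch.nn.functional as F

def fgsm_examples(model, x, y, eps=0.1):
    """Craft adversarial examples with a single gradient-sign step (FGSM-style).
    Assumes continuous, differentiable inputs -- a simplification for security data."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def adversarial_training_epoch(model, loader, optimizer, eps=0.1):
    """One epoch of generic adversarial training: augment each batch with
    adversarial copies of its samples and retrain on the combined batch."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_examples(model, x, y, eps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(torch.cat([x, x_adv])),
                               torch.cat([y, y]))
        loss.backward()
        optimizer.step()
```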
Funding
  • We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research
  • This project was supported in part by NSF grants CNS-1718459, CNS-1750101 and CNS-1717028
  • Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of any funding agencies
References
  • [1] Mimicus. https://github.com/srndic/mimicus. (2014).
  • [2] Daniel Arp, Michael Spreitzenbarth, Malte Hubner, Hugo Gascon, Konrad Rieck, and CERT Siemens. 2014. DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket. In Proceedings of the 20th Network and Distributed System Security Symposium (NDSS).
  • [3] Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE (2015).
  • [4] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
  • [5] Tiffany Bao, Johnathon Burket, Maverick Woo, Rafael Turner, and David Brumley. 2014. ByteWeight: Learning to recognize functions in binary code. In Proceedings of the 23rd USENIX Security Symposium (USENIX Security).
  • [6] Osbert Bastani, Carolyn Kim, and Hamsa Bastani. 2017. Interpreting blackbox models via model extraction. arXiv preprint arXiv:1705.08504 (2017).
  • [7] Konstantin Berlin, David Slater, and Joshua Saxe. 2015. Malicious behavior detection using windows audit logs. In Proceedings of the 8th Workshop on Artificial Intelligence and Security (AISec).
  • [8] Arjun Nitin Bhagoji, Daniel Cullina, and Prateek Mittal. 2017. Dimensionality reduction as a defense against evasion attacks on machine learning classifiers. arXiv preprint arXiv:1704.02654 (2017).
  • [9] Xiaoyu Cao and Neil Zhenqiang Gong. 2017. Mitigating evasion attacks to deep neural networks via region-based classification. In Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC).
  • [10] Yinzhi Cao and Junfeng Yang. 2015. Towards making systems forget with machine unlearning. In Proceedings of the 36th IEEE Symposium on Security and Privacy (S&P).
  • [11] Nicholas Carlini and David Wagner. 2017. Towards evaluating the robustness of neural networks. In Proceedings of the 38th IEEE Symposium on Security and Privacy (S&P).
  • [12] Gert Cauwenberghs and Tomaso Poggio. 2000. Incremental and decremental support vector machine learning. In Proceedings of the 13th Conference on Neural Information Processing Systems (NIPS).
  • [13] Peng Chen and Hao Chen. 2018. Angora: Efficient Fuzzing by Principled Search. In Proceedings of the 39th IEEE Symposium on Security and Privacy (S&P).
  • [14] François Chollet et al. 2017. Keras. (2017).
  • [42] Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. DeepXplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP).
  • [43] Paradyn Project. 2016. Dyninst: An application program interface (API) for runtime code generation. http://www.dyninst.org (2016).
  • [44] Sarunas J. Raudys and Anil K. Jain. 1991. Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners. IEEE Transactions on Pattern Analysis and Machine Intelligence (1991).
  • [45] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD).
  • [46] Enrique Romero, Ignacio Barrio, and Lluís Belanche. 2007. Incremental and decremental learning for linear support vector machines. In Proceedings of the International Conference on Artificial Neural Networks (ICANN).
  • [47] Sherif Saad, Issa Traore, Ali Ghorbani, Bassam Sayed, David Zhao, Wei Lu, John Felix, and Payman Hakimian. 2011. Detecting P2P botnets through network behavior analysis and machine learning. In Proceedings of the 9th International Conference on Privacy, Security and Trust (PST).
  • [48] Joshua Saxe and Konstantin Berlin. 2015. Deep neural network based malware detection using two dimensional binary program features. In Proceedings of the International Conference on Malicious and Unwanted Software (MALWARE).
  • [49] Henry Scheffe. 1947. The relation of control charts to analysis of variance and chi-square tests. Journal of the American Statistical Association (1947).
  • [50] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2016. Grad-CAM: Visual explanations from deep networks via gradient-based localization. arXiv preprint arXiv:1610.02391 (2016).
  • [51] Monirul Sharif, Andrea Lanzi, Jonathon Giffin, and Wenke Lee. 2009. Automatic reverse engineering of malware emulators. In Proceedings of the 30th IEEE Symposium on Security and Privacy (S&P).
  • [52] Eui Chul Richard Shin, Dawn Song, and Reza Moazzezi. 2015. Recognizing Functions in Binaries with Neural Networks. In Proceedings of the 24th USENIX Security Symposium (USENIX Security).
  • [53] Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning Important Features Through Propagating Activation Differences. In Proceedings of the 34th International Conference on Machine Learning (ICML).
  • [54] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034 (2013).
  • [55] Charles Smutz and Angelos Stavrou. 2012. Malicious PDF detection using metadata and structural features. In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC).
  • [56] Dawn Song, David Brumley, Heng Yin, Juan Caballero, Ivan Jager, Min Gyung Kang, et al. 2008. BitBlaze: A new approach to computer security via binary analysis. In Proceedings of the 4th International Conference on Information Systems Security (ICISS).
  • [57] Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. 2014. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806 (2014).
  • [58] Nedim Srndic and Pavel Laskov. 2014. Practical evasion of a learning-based classifier: A case study. In Proceedings of the 35th IEEE Symposium on Security and Privacy (S&P).
  • [59] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2016. Gradients of counterfactuals. arXiv preprint arXiv:1611.02639 (2016).
  • [60] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the 27th Conference on Neural Information Processing Systems (NIPS).
  • [61] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
  • [62] Tuan A. Tang, Lotfi Mhamdi, Des McLernon, Syed Ali Raza Zaidi, and Mounir Ghogho. 2016. Deep learning approach for network intrusion detection in software defined networking. In Proceedings of the 12th International Conference on
  • [63] Theano Development Team. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv preprint arXiv:1605.02688 (2016).
  • [64] Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight. 2005. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) (2005).
  • [65] Cheng-Hao Tsai, Chieh-Yen Lin, and Chih-Jen Lin. 2014. Incremental and decremental training for linear classification. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD).
  • [66] Grigorios Tzortzis and Aristidis Likas. 2007. Deep belief networks for spam filtering. In Proceedings of the 19th International Conference on Tools with Artificial Intelligence (ICTAI).
  • [67] Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, and C. Lee Giles. 2016. Learning adversary-resistant deep neural networks. arXiv preprint arXiv:1612.01401 (2016).
  • [68] Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, and C. Lee Giles. 2017. Adversary resistant deep neural networks with an application to malware detection. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD).
  • Han Xiao, Kashif Rasul, and Roland Vollgraf. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).
  • Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, and Alan Yuille. 2017. Mitigating adversarial effects through randomization. In Proceedings of the 6th International Conference on Learning Representations (ICLR).
  • Xiaojun Xu, Chang Liu, Qian Feng, Heng Yin, Le Song, and Dawn Song. 2017. Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS).
  • Zhaogui Xu, Shiqing Ma, Xiangyu Zhang, Shuofei Zhu, and Baowen Xu. 2018. Trojanning Attack on Neural Networks. In Proceedings of the 25th Network and Distributed System Security Symposium (NDSS).
  • Michal Zalewski. 2007. American fuzzy lop. (2007).
  • Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. In Proceedings of the 13th European Conference on Computer Vision (ECCV).
  • Mingwei Zhang and R. Sekar. 2013. Control Flow Integrity for COTS Binaries. In Proceedings of the 22nd USENIX Conference on Security (USENIX Security).
  • Luisa M. Zintgraf, Taco S. Cohen, Tameem Adel, and Max Welling. 2017. Visualizing deep neural network decisions: Prediction difference analysis. In Proceedings of the 5th International Conference on Learning Representations (ICLR).