Static Analysis for Malware Classification Using Machine and Deep Learning.

Marcelo Invert Palma Salas, Paulo Lício de Geus

2023 XLIX Latin American Computer Conference (CLEI) (2023)

Abstract
Malware, or malicious software, is a general term for any program or code that can harm a system. This hostile, intrusive, and intentionally harmful code uses a variety of techniques to protect itself and to evade detection and removal, including code obfuscation, polymorphism, metamorphism, encryption, and encrypted communication. Current state-of-the-art research focuses on applying artificial intelligence techniques to malware detection and classification. In this context, this paper proposes a new approach to malware classification through static analysis, using seven machine learning algorithms (LightGBM, XGBoost, Logistic Regression, KNN, SVM, Naive Bayes, and Random Forest) and deep learning fine-tuning. During data engineering, the models use the SelectKBest technique to select the 893 most relevant features for classifying 10,868 malware samples into 9 families, which reduces overfitting and training time. The results show that gradient boosting algorithms such as LightGBM, with hyperparameter optimization, surpass the reference results from competitions such as Kaggle, achieving a logarithmic loss of 0.00118, an accuracy close to 100%, and prediction times below 2.3 ms, fast enough to classify malware in real-time systems.
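To give a sense of scale for the reported logarithmic loss, the following is a minimal sketch of the usual multiclass log loss: the mean negative log-probability a model assigns to the true class. It assumes this standard definition is the one behind the reported figure, which the abstract does not state explicitly. The function name, the clipping constant, and the toy probabilities are illustrative and do not come from the paper.

```python
import math


def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices.
    y_prob: list of per-class probability lists, one per sample.
    eps:    clipping constant (an assumption) that avoids log(0).
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        # Clip the true-class probability into [eps, 1 - eps].
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy data: 3 samples, 3 classes (the paper classifies into 9 families).
y_true = [0, 2, 1]
y_prob = [
    [0.98, 0.01, 0.01],
    [0.05, 0.05, 0.90],
    [0.10, 0.80, 0.10],
]
loss = multiclass_log_loss(y_true, y_prob)
print(f"log loss = {loss:.5f}")

# Geometric-mean probability of the true class implied by a given loss.
print(f"implied true-class probability = {math.exp(-loss):.4f}")
```

Under this definition, a loss of 0.00118 would correspond to a geometric-mean true-class probability of exp(-0.00118), roughly 0.9988, which is consistent with the near-100% accuracy the abstract reports.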
Keywords
malware classification, static analysis, SelectKBest, LightGBM, machine learning, deep learning