Android Malware Detection Using API Calls: A Comparison of Feature Selection and Machine Learning Models

PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON APPLIED CYBER SECURITY (ACS) 2021 (2022)

Abstract
Android has become a major target for malware attacks due to its popularity and the ease with which applications can be distributed. According to a recent study, around 11,000 new malware samples appear online on a daily basis. Machine learning approaches have been shown to perform well in detecting malware. In particular, API calls have been found to be one of the best performing features in malware detection. However, due to the functionalities provided by the Android SDK, applications can use many API calls, creating a computational overhead while training machine learning models. In this study, we look at the benefits of using feature selection to reduce this overhead. We consider three different feature selection algorithms (mutual information, variance threshold and Pearson correlation coefficient) when used with five different machine learning models: support vector machines, decision trees, random forests, Naive Bayes and AdaBoost. We collected a dataset of 40,000 Android applications that used 134,207 different API calls. Our results show that the number of API calls can be reduced by approximately 95%, whilst still being more accurate than when the full API feature set is used. Random forests achieve the best discrimination between malware and benign applications, with an accuracy of 96.1%.
Keywords
api calls,feature selection,machine learning
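
The three feature selection scores named in the abstract can be illustrated on a toy example. The sketch below is not the authors' implementation: the data, the helper names (`variance`, `mutual_information`, `pearson`, `top_k`) and the choice of a zero variance cutoff are all assumptions made for illustration. Each row stands for an application, each column for a hypothetical API call (1 if the application uses it), and the label is 1 for malware and 0 for benign.

```python
import math

def variance(col):
    """Population variance of a numeric column."""
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)

def mutual_information(col, labels):
    """Mutual information (in bits) between a binary feature and a binary label."""
    n = len(col)
    mi = 0.0
    for x in (0, 1):
        for y in (0, 1):
            joint = sum(1 for a, b in zip(col, labels) if a == x and b == y) / n
            px = sum(1 for a in col if a == x) / n
            py = sum(1 for b in labels if b == y) / n
            if joint > 0:  # empty cells contribute nothing
                mi += joint * math.log2(joint / (px * py))
    return mi

def pearson(col, labels):
    """Pearson correlation coefficient; 0.0 if either side is constant."""
    n = len(col)
    mx, my = sum(col) / n, sum(labels) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(col, labels))
    sx = math.sqrt(sum((a - mx) ** 2 for a in col))
    sy = math.sqrt(sum((b - my) ** 2 for b in labels))
    return cov / (sx * sy) if sx and sy else 0.0

def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy data (made up): rows are apps, columns are hypothetical API calls.
X = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = malware, 0 = benign

columns = list(zip(*X))
var_scores = [variance(c) for c in columns]
mi_scores = [mutual_information(c, y) for c in columns]
pcc_scores = [abs(pearson(c, y)) for c in columns]

# Variance threshold: drop API calls used by every app (or by none).
kept_by_variance = [i for i, v in enumerate(var_scores) if v > 0.0]

print("kept by variance threshold:", kept_by_variance)
print("best feature by mutual information:", top_k(mi_scores, 1))
print("best feature by |Pearson|:", top_k(pcc_scores, 1))
```

In this toy data, column 0 is used by every application and so carries no information, while column 1 matches the label exactly and ranks first under both mutual information and Pearson correlation. In a setting like the one described in the abstract, the retained columns would then be passed to a classifier such as a random forest; that step is omitted here.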