Data Mining Based Strategy for Detecting Malicious PDF Files

Samir G. Sayed, Mohmed Shawkey

2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE)(2018)

引用 8|浏览8
暂无评分
摘要
Portable Document Format (PDF) is one of the widely-accepted document format. However, it becomes one of the most attractive targets for exploitation by malware developers and vulnerability researchers. Malicious PDF files can be used in Advanced Persistent Threats (APTs) targeting individuals, governments, and financial sectors. The existing tools such as intrusion detection systems (IDSs) and antivirus packages are inefficient to mitigate this kind of attacks. This is because these techniques need regular updates with the new malicious PDF files which are increasing every day. In this paper, a new algorithm is presented for detecting malicious PDF files based on data mining techniques. The proposed algorithm consists of feature selection stage and classification stage. The feature selection stage is used to the select the optimum number of features extracted from the PDF file to achieve high detection rate and low false positive rate with small computational overhead. Experimental results show that the proposed algorithm can achieve 99.77% detection rate, 99.84% accuracy, and 0.05% false positive rate.
更多
查看译文
关键词
Malicious PDF detection, data mining, gravitational research algorithm (GSA)
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要