Characterization and classification of malicious Web traffic.

Computers & Security (2014)

Abstract
Web systems commonly face a unique set of vulnerabilities and security threats due to their high exposure, access by browsers, and integration with databases. This study focuses on the characterization and classification of malicious cyber activities aimed at Web systems. The empirical analysis is based on three datasets, each spanning four to five months, collected by high-interaction honeypots that ran fully functional three-tier Web systems. We first explore the types and prevalence of malicious scans and attacks on Web systems, and the extent to which these malicious activities differ across time periods or across Web servers running different services. In addition to descriptive statistical analysis, we include an inferential statistical analysis of malicious session attributes, such as duration, number of requests, and bytes transferred in a session. Then, we use supervised machine learning methods to classify attacker activities into two classes: vulnerability scans and attacks. Our main observations include the following: (1) Some characteristics of the malicious Web traffic were invariant across different servers and time periods, such as the dominant use of the search-based strategy for attacking the servers and the heavy-tailed behavior of session attributes. (2) On the other hand, servers running different services experienced almost complementary profiles of vulnerability scan and attack types. (3) Supervised learning methods efficiently distinguished attack sessions from vulnerability scan sessions, with a high probability of detection and a very low probability of false alarms. (4) The decision-tree-based methods J48 and PART performed better than SVM across all datasets. (5) Attacks differed from vulnerability scans in only a small number of session attributes; depending on the dataset, classification of malicious activities can be performed using four to six features without significantly affecting the learners' performance compared to using all 43 features.
Keywords
Web security, Empirical study, Malicious web sessions, Vulnerability scans, Attacks, Statistical inference, Classification
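
Observation (3) in the abstract reports classifier quality as a probability of detection and a probability of false alarm. The sketch below shows how these two quantities are conventionally computed from labeled sessions. It is an illustration only, not the authors' code: the function name, the label strings, and the toy data are all hypothetical, and treating "attack" as the positive class is an assumption.

```python
def detection_and_false_alarm(true_labels, predicted_labels, positive="attack"):
    """Return (probability of detection, probability of false alarm).

    Assumption: sessions labeled `positive` are the class to be detected;
    every other label (here, "scan") is treated as the negative class.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # attack correctly flagged as attack
            else:
                fn += 1  # attack missed
        else:
            if pred == positive:
                fp += 1  # scan wrongly flagged as attack
            else:
                tn += 1  # scan correctly left as scan
    # Detection probability = TP / (TP + FN); false alarm = FP / (FP + TN).
    p_detect = tp / (tp + fn) if (tp + fn) else float("nan")
    p_false_alarm = fp / (fp + tn) if (fp + tn) else float("nan")
    return p_detect, p_false_alarm


# Toy labels invented for illustration; they are not data from the study.
truth = ["attack"] * 4 + ["scan"] * 6
predicted = ["attack", "attack", "attack", "scan",
             "scan", "scan", "scan", "scan", "scan", "attack"]

p_detect, p_false_alarm = detection_and_false_alarm(truth, predicted)
print(f"probability of detection:   {p_detect:.3f}")
print(f"probability of false alarm: {p_false_alarm:.3f}")
```

With these definitions, a classifier that separates attacks from vulnerability scans well has a detection probability close to 1 and a false-alarm probability close to 0.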