Evaluating Supervised Machine Learning Models for Zero-Day Phishing Attack Detection: A Comprehensive Study

Zahra Lotfi, Sara Valipourebrahimi, Thomas Tran

Research Square (2023)

Abstract: Detecting and preventing cyber-attacks is essential to keeping e-commerce websites secure. Among the many types of cyber-attack, zero-day attacks are especially hard to identify because they are unknown to the security system: when an attacker launches one, the malicious case matches none of the existing defined patterns. Many machine learning models, supervised models in particular, have been developed to analyze and detect phishing websites. However, zero-day attacks have not been seen before, so the model has never been trained on their patterns. Supervised models designed to detect phishing URLs must therefore be very accurate in predicting the labels of unseen data. This research addresses that issue by evaluating seven supervised machine learning models to assess their accuracy in predicting zero-day phishing attacks. Unlike previous studies, which examined models only on features extracted from URLs, our evaluation framework uses a comprehensive dataset that includes URL features, third-party extracted features, and content-based features. This research also examines how the models perform when dimensionality reduction techniques are applied. By reducing the dimensionality of the dataset, we aim to improve computational efficiency without compromising the accuracy of the models. The results show that XGBoost performs best on zero-day attack datasets, with an accuracy and F1-score of 96.6%, and that PCA can be applied to high-dimensional datasets without adverse effects on the models' performance.
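The abstract reports results as accuracy and F1-score. As a reminder of what those two numbers measure, the following is a minimal sketch of how they are computed for a binary phishing/legitimate classifier. The function name `accuracy_and_f1`, the label encoding (1 = phishing, 0 = legitimate), and the toy labels are assumptions made for illustration; none of them come from the paper, and this is not the authors' evaluation code.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, f1) for binary labels.

    `positive` is the label treated as the phishing class
    (an assumed encoding, not taken from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # phishing correctly flagged
        elif pred == positive:
            fp += 1  # legitimate site wrongly flagged
        elif truth == positive:
            fn += 1  # phishing site missed
        else:
            tn += 1  # legitimate site correctly passed

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

Accuracy counts every correct prediction equally, while F1 ignores true negatives and penalizes both missed phishing pages and false alarms, which is why the two are often reported together.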
Keywords
supervised machine learning models, machine learning, zero-day