ProvSec: Open Cybersecurity System Provenance Analysis Benchmark Dataset with Labels

Madhukar Shrestha, Yonghyun Kim, Jeehyun Oh, Junghwan (John) Rhee, Yung Ryn Choe, Fei Zuo, Myungah Park, Gang Qian

Int. J. Networked Distributed Comput. (2023)

Abstract
System provenance forensic analysis has been studied extensively. It requires fine-grained data, such as system calls together with their event fields, to track dependencies between events. Although security datasets have been proposed in prior work, we found that a dataset combining realistic attacks with the details needed for high-quality provenance tracking is lacking. We created a new dataset of eleven vulnerable cases for system forensic analysis. It includes the full details of system calls, including syscall parameters, and uses realistic attack scenarios with real software vulnerabilities and exploits. For each case, we created two sets of scenarios, benign and adversarial, which are manually labeled for supervised machine-learning analysis. In addition, we present an algorithm to improve data quality in system provenance forensic analysis. We demonstrate the details of the dataset events and the dependency analysis of our dataset cases.
Keywords
Provenance, Dataset, Attack, Backtracking
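
To make the idea of tracking event dependencies from system calls concrete, the following is a minimal sketch of generic backward dependency tracking over a list of syscall events. It is not the algorithm or the data format from the paper: the `Event` record, its field names, the node labels such as `proc:bash`, and the sample events are all hypothetical and chosen only for illustration. The sketch also assumes that timestamps are distinct.

```python
from collections import namedtuple

# Hypothetical event record (not the dataset's schema):
# information flows from `src` to `dst` at time `ts`.
Event = namedtuple("Event", ["ts", "syscall", "src", "dst"])


def backtrack(events, target, start_ts):
    """Return, in time order, the events that could have influenced
    `target` at or before `start_ts`."""
    # For each tracked node, the latest time at which an incoming
    # flow can still matter to the target.
    deadline = {target: start_ts}
    related = []
    # Visit events from newest to oldest so each is examined once.
    for ev in sorted(events, key=lambda e: e.ts, reverse=True):
        limit = deadline.get(ev.dst)
        if limit is not None and ev.ts <= limit:
            related.append(ev)
            # The source only matters up to the time of this flow.
            deadline[ev.src] = max(deadline.get(ev.src, ev.ts), ev.ts)
    related.reverse()
    return related


# Illustrative events only; they are not taken from the dataset.
events = [
    Event(1, "read", "file:/tmp/payload", "proc:bash"),
    Event(2, "write", "proc:cron", "file:/var/log/cron.log"),
    Event(3, "clone", "proc:bash", "proc:nc"),
    Event(4, "read", "file:/etc/hosts", "proc:bash"),
    Event(5, "write", "proc:nc", "file:/etc/passwd"),
]

for ev in backtrack(events, "file:/etc/passwd", 5):
    print(ev.ts, ev.syscall, ev.src, "->", ev.dst)
```

In this toy trace, the read of `/etc/hosts` at time 4 is excluded because it happens after `proc:bash` created `proc:nc`, so it cannot have influenced the final write. This timestamp check is the reason such analysis needs per-event fields rather than only a list of syscall names.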