Security Issue Classification for Vulnerability Management with Semi-supervised Learning

Emil Wareus, Anton Duppils, Magnus Tullberg, Martin Hell

Proceedings of the 8th International Conference on Information Systems Security and Privacy (ICISSP) (2021)

Abstract
Open-Source Software (OSS) is increasingly common in industry software and enables developers to build better applications, at a higher pace, and with better security. These advantages also come with the cost of including vulnerabilities through these third-party libraries. The largest publicly available database of easily machine-readable vulnerabilities is the National Vulnerability Database (NVD). However, reporting to this database is a human-dependent process, and it fails to provide an acceptable coverage of all open source vulnerabilities. We propose the use of semi-supervised machine learning to classify issues as security-related to provide additional vulnerabilities in an automated pipeline. Our models, based on a Hierarchical Attention Network (HAN), outperform previously proposed models on our manually labelled test dataset, with an F1 score of 71%. Based on the results and the vast number of GitHub issues, our model potentially identifies about 191 036 security-related issues with prediction power over 80%.
Keywords
Machine Learning, Open-Source Software, Vulnerabilities, Semi-supervised Learning, Classification