Understanding and Tackling Label Errors in Deep Learning-Based Vulnerability Detection (Experience Paper)

Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2023 (2023)

Abstract
Software system complexity and security vulnerability diversity are plausible sources of the persistent challenges in software vulnerability research. Applying deep learning methods to automatic vulnerability detection has proven an effective complement to traditional detection approaches. Unfortunately, the lack of high-quality benchmark datasets can critically restrict the effectiveness of deep learning-based vulnerability detection techniques. Specifically, the long-standing presence of erroneous labels in existing vulnerability datasets may lead to inaccurate, biased, and even flawed results. In this paper, we aim to obtain an in-depth understanding and explanation of the causes of label errors. To this end, we systematically analyze the diverse datasets used by state-of-the-art learning-based vulnerability detection approaches and examine their techniques for collecting vulnerable source code datasets. We find that label errors heavily impact mainstream vulnerability detection models, with a worst-case average F1 drop of 20.7%. As mitigation, we introduce two approaches to dataset denoising, which improve model performance by an average of 10.4%. Leveraging these dataset denoising methods, we provide a feasible solution for obtaining high-quality labeled datasets.
Keywords
deep learning,vulnerability detection,denoising