VDTriplet: Vulnerability detection with graph semantics using triplet model

Hao Sun, Lei Cui, Lun Li, Zhenquan Ding, Siyuan Li, Zhiyu Hao, Hongsong Zhu

Computers & Security (2024)

Abstract

This study presents VDTriplet, a novel learning framework for building vulnerability detection models. VDTriplet is the first attempt to use deep learning to avoid misjudging potential known vulnerable functions, a problem that arises because a vulnerable function and its fixed version differ only slightly. Unlike prior work that treats a program as a sequence of tokens or as a randomly initialized graph for supervised binary classification, our model not only fuses rich syntactic and semantic information to obtain the most accurate program representation, but also uses a TripletNN model to reduce the misjudgment of potential known vulnerabilities. VDTriplet first extracts the subgraphs that cause a vulnerability, guided by typical programming errors, to reduce redundant code. It then uses a pre-trained model and an unsupervised model to encode the subgraphs, thereby minimizing the influence of randomly initialized graph nodes and avoiding the need for supervised labeling. Finally, the TripletNN model minimizes the distance between potential vulnerabilities and vulnerabilities of the same type, and maximizes the distance between potential vulnerabilities and fixed vulnerabilities, to reduce false positives. The results show that VDTriplet performs significantly better than the studied baselines. Compared with the best-performing model in the literature, our model improves Accuracy, Precision, Recall and F1-Score on the test results by 4.89%, 4.23%, 4.56% and 5.34%, respectively. Moreover, it generalizes well when detecting vulnerabilities in eight new applications, demonstrating that it is potentially valuable in practical use.
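
The triplet objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so this assumes the standard triplet margin loss with Euclidean distance. The function names, the two-dimensional embeddings and the margin of 1.0 are hypothetical choices for illustration, not values from the paper. Here the anchor stands for a potential vulnerability, the positive for a known vulnerability of the same type, and the negative for the fixed version.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (an assumption, not the paper's formula).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = euclidean(anchor, positive)  # distance to same-type vulnerability
    d_neg = euclidean(anchor, negative)  # distance to the fixed function
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, chosen only to make the arithmetic easy.
anchor = [0.0, 0.0]
far_point = [3.0, 4.0]   # distance 5 from the anchor
near_point = [0.0, 1.0]  # distance 1 from the anchor

# Badly placed embedding: the fixed function is nearer than the true match.
loss_violated = triplet_loss(anchor, positive=far_point, negative=near_point)

# Well placed embedding: the true match is nearer by more than the margin.
loss_satisfied = triplet_loss(anchor, positive=near_point, negative=far_point)

print(loss_violated, loss_satisfied)
```

Under these assumptions, training pushes embeddings from the first configuration toward the second, which is how a triplet model could separate a vulnerable function from its near-identical fixed version.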

Keywords

Vulnerability detection, Deep learning, Extracting subgraphs, Encoding subgraphs, TripletNN model
