Varangian: A Git Bot for Augmented Static Analysis

2022 Mining Software Repositories Conference (MSR 2022)

Abstract
The complexity and scale of modern software programs often lead to overlooked programming errors and security vulnerabilities. Developers often rely on automatic tools, such as static analysis tools, to look for bugs and vulnerabilities. Static analysis tools are widely used because they can understand nontrivial program behaviors, scale to millions of lines of code, and detect subtle bugs. However, they are known to generate an excess of false alarms, which hinders their adoption: it is counterproductive for developers to go through a long list of reported issues only to find a few true positives. One proposed way to suppress false positives is to use machine learning to identify them. However, training machine learning models requires good-quality labeled datasets. For this purpose, we developed D2A [3], a differential-analysis-based approach that uses the commit history of a code repository to create a labeled dataset of Infer [2] static analysis output. The data generated by D2A can be used to train AI models, such as voting and stacking ensembles and C-BERT [1], which learn to identify False Positives. The ensembles are built on top of boosting and tree-based classifiers, which use hand-crafted features to classify static analyzer output as True Positives. C-BERT is a BERT-base language model pretrained from scratch on C source code extracted from 10,000 C repositories, thus leveraging "big code". This model is then fine-tuned on D2A labeled data. This approach views source code as language and automates the extraction of features. The output of the models is a prioritized list of defects ordered by the likelihood of being a True Positive. One way to use the Augmented Static Analyzer to improve developer productivity is to integrate it into the developer workflow so that its use feels natural.
With this idea in mind, we created Varangian, a Git bot that automatically creates issues on the repository based on defects prioritized by the Augmented Static Analyzer. Models trained on the D2A dataset created from the same repository are used to create the prioritized list. The models achieve about 0.9 AUC when tested on historical data of 7 different open-source projects. In terms of FP/TP ratio, the improvement was at least 20 times for most projects. It is important to note that the test data will generally contain defects from many commits and will have a different distribution from any individual commit, which may contain fewer and less varied defects. We test the models on the latest commit of a project by first running the commit through the inference pipeline and then manually validating the output. The inference pipeline, which produces the prioritized list for a single commit, first applies the Infer static analyzer to the latest commit of the repository. Relevant source code is then extracted based on the bug report for each defect highlighted by the static analyzer. The bug reports and the relevant source code extracted from the repository are used to generate features for the AI models, which then assign each defect a likelihood of being a True Positive. The Git bot then takes the defects most likely to be True Positives and creates an issue for each of them. Each issue created by Varangian includes detailed information that the developer can use for debugging. On a recent commit of an open-source project, the Varangian bot created 5 issues, of which 1 was a TP. This gives an FP/TP ratio of 4/1, which is five times better than the 20/1 ratio we observe for Infer on the test set of the same project.
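The last two steps described above, selecting the highest-ranked defects and comparing FP/TP ratios, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Varangian implementation: the `Defect` schema, the helper names `top_k` and `fp_tp_ratio`, and every score and label in `reports` are hypothetical, chosen so that the top five reports contain one True Positive as in the example above. Only the 4/1 and 20/1 ratios come from the abstract.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Defect:
    """One static-analyzer report with a model score (hypothetical schema)."""
    bug_id: str
    score: float            # model-assigned likelihood of being a True Positive
    is_true_positive: bool  # label obtained by manual validation


def top_k(defects, k):
    """Return the k defects the model ranks as most likely True Positives."""
    return sorted(defects, key=lambda d: d.score, reverse=True)[:k]


def fp_tp_ratio(defects):
    """Return FP/TP as an exact fraction; undefined when there is no TP."""
    tp = sum(d.is_true_positive for d in defects)
    fp = len(defects) - tp
    if tp == 0:
        raise ValueError("FP/TP ratio is undefined without a true positive")
    return Fraction(fp, tp)


# Invented scores and labels, for illustration only.
reports = [
    Defect("bug-1", 0.97, False),
    Defect("bug-2", 0.91, True),
    Defect("bug-3", 0.88, False),
    Defect("bug-4", 0.84, False),
    Defect("bug-5", 0.80, False),
    Defect("bug-6", 0.35, False),
    Defect("bug-7", 0.20, False),
]

# The bot would open issues only for the top-ranked defects.
issues = top_k(reports, 5)
bot_ratio = fp_tp_ratio(issues)       # 4 FP to 1 TP

# Baseline ratio for Infer on the test set, as stated in the abstract.
infer_ratio = Fraction(20, 1)
improvement = infer_ratio / bot_ratio  # (20/1) / (4/1) = 5
```

Using `Fraction` keeps the ratios exact, so the five-fold improvement follows directly from dividing 20/1 by 4/1.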
In this presentation, we will showcase Varangian, compare different model training approaches, and discuss both the challenges involved in building a training and inference pipeline based on a code repository and the impact on performance when moving from the test set to the latest commit.
Keywords
security, static analysis, git, bot, machine learning, bert