Data-driven test selection at scale

Sonu Mehta,Farima Farmahinifarahani,Ranjita Bhagwan, Suraj Guptha, Sina Jafari,Rahul Kumar,Vaibhav Saini,Anirudh Santhiar

Foundations of Software Engineering（2021）

引用 2|浏览26

暂无评分

摘要

ABSTRACTLarge-scale services depend on Continuous Integration/Continuous Deployment (CI/CD) processes to maintain their agility and code-quality. Change-based testing plays an important role in finding bugs, but testing after every change is prohibitively expensive at a scale where thousands of changes are committed every hour. Test selection models deal with this issue by running a subset of tests for every change. In this paper, we present a generic, language-agnostic and lightweight statistical model for test selection. Unlike existing techniques, the proposed model does not require complex feature extraction techniques. Consequently, it scales to hundreds of repositories of varying characteristics while capturing more than 99% of buggy pull requests. Additionally, to better evaluate test selection models, we propose application-specific metrics that capture both a reduction in resource cost and a reduction in pull-request turn-around time. By evaluating our model on 22 large repositories at Microsoft, we find that we can save 15%−30% of compute time while reporting back more than ≈99% of buggy pull requests.

查看译文

关键词

test selection, continuous integration, statistical models

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要