RegMiner: mining replicable regression dataset from code repositories.

Xuezhi Song,Yun Lin,Yijian Wu, Yifan Zhang,Siang Hwee Ng,Xin Peng,Jin Song Dong,Hong Mei

ACM SIGSOFT Conference on the Foundations of Software Engineering (FSE)（2022）

引用 0|浏览72

暂无评分

摘要

In this work, we introduce a tool, RegMiner, to automate the process of collecting replicable regression bugs from a set of Git repositories. In the code commit history, RegMiner searches for regressions where a test can pass a regression-fixing commit, fail a regressioninducing commit, and pass a previous working commit again. Technically, RegMiner (1) identifies potential regression-fixing commits from the code evolution history, (2) migrates the test and its code dependencies in the commit over the history, and (3) minimizes the compilation overhead during the regression search. Our experients show that RegMiner can successfully collect 1035 regressions over 147 projects in 8 weeks, creating the largest replicable regression dataset within the shortest period, to the best of our knowledge. In addition, our experiments further show that (1) RegMiner can construct the regression dataset with very high precision and acceptable recall, and (2) the constructed regression dataset is of high authenticity and diversity. The source code of RegMiner is available at https://github.com/SongXueZhi/RegMiner, the mined regression dataset is available at https://regminer.github.io/, and the demonstration video is available at https://youtu.be/yzcM9Y4unok.

查看译文

关键词

mining replicable regression,code

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要