An empirical study on code review activity prediction in practice
arXiv (2024)
Abstract
During code reviews, an essential step in software quality assurance,
reviewers have the difficult task of understanding and evaluating code changes
to validate their quality and prevent introducing faults to the codebase. This
is a tedious process where the effort needed is highly dependent on the code
submitted, as well as the author's and the reviewer's experience, leading to
median wait times for review feedback of 15-64 hours. Through an initial user
study conducted with 29 experts, we found that re-ordering the files changed by a
patch within the review environment has the potential to improve review quality, as
more comments are written (+23%) [...]
precision and recall increases to 53% [...]
compared to the alphanumeric ordering. Hence, this paper aims to help code
reviewers by predicting which files in a submitted patch (1) need to be
commented on, (2) need to be revised, or (3) are hot-spots (commented or revised).
For these prediction tasks, we evaluate two different types of text embeddings (i.e.,
Bag-of-Words and Large Language Models encoding) and review process features
(i.e., code size-based and history-based features). Our empirical study on
three open-source and two industrial datasets shows that combining the code
embedding and review process features leads to better results than the
state-of-the-art approach. For all tasks, F1-scores (median of 40-62%) are
significantly better than the state-of-the-art (from +1 to +9%).
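
As an illustration of the idea described above, the following minimal sketch concatenates a Bag-of-Words text embedding with code size-based and history-based features for each changed file, then orders the files by a predicted hot-spot score instead of alphanumerically. It is not the paper's implementation: the field names (`diff`, `lines_added`, `lines_deleted`, `past_comments`, `past_revisions`), the three-word vocabulary, and the toy data are all assumptions, and the scorer is a placeholder that simply sums the features where a trained classifier would normally go.

```python
from collections import Counter
import re


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary token occurs in the text."""
    counts = Counter(re.findall(r"[A-Za-z_]\w*", text.lower()))
    return [counts[token] for token in vocabulary]


def file_features(change, vocabulary):
    """Concatenate the text embedding with review process features."""
    text_part = bag_of_words(change["diff"], vocabulary)
    # Code size-based features (hypothetical field names).
    size_part = [change["lines_added"], change["lines_deleted"]]
    # History-based features (hypothetical field names).
    history_part = [change["past_comments"], change["past_revisions"]]
    return text_part + size_part + history_part


def order_for_review(changes, scores):
    """Return file paths sorted by descending predicted hot-spot score."""
    ranked = sorted(zip(changes, scores), key=lambda pair: pair[1], reverse=True)
    return [change["path"] for change, _ in ranked]


# Toy patch with two changed files (illustrative values only).
vocabulary = ["lock", "thread", "todo"]
changes = [
    {"path": "a/readme.md", "diff": "Fix typo in docs",
     "lines_added": 1, "lines_deleted": 1,
     "past_comments": 0, "past_revisions": 0},
    {"path": "b/worker.py", "diff": "lock = Lock()  # TODO: thread safety",
     "lines_added": 40, "lines_deleted": 5,
     "past_comments": 7, "past_revisions": 3},
]

features = [file_features(change, vocabulary) for change in changes]

# Placeholder scorer: sums the features. A trained classifier would go here.
scores = [sum(row) for row in features]

print(order_for_review(changes, scores))
```

With this toy data, alphanumeric ordering would show `a/readme.md` first, whereas the score-based ordering surfaces `b/worker.py` first.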