A multi-objective effort-aware approach for early code review prediction and prioritization

Empirical Software Engineering (2023)

Abstract

Modern Code Review (MCR) is an essential practice in software engineering. MCR helps detect defects early and prevent poor implementation practices, and it brings other benefits such as knowledge sharing, team awareness, and collaboration. However, reviewing code changes is hard and time-consuming, so developers must prioritize code review tasks to make the best use of their time and effort. Previous approaches prioritized code reviews by their likelihood of being merged, using machine learning (ML) models trained to maximize prediction performance. However, these approaches did not consider review effort, which results in sub-optimal solutions for code review prioritization. It is thus important to account for review effort when prioritizing code review requests, so that developers can optimize their effort while maximizing the number of merged code changes. To address this issue, we propose CostAwareCR, a multi-objective optimization-based approach that predicts and prioritizes code review requests based on their likelihood of being merged and on their review effort, measured as the size of the reviewed code. CostAwareCR uses the RuleFit algorithm to learn relevant features. Our approach then learns Logistic Regression (LR) model weights using the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to simultaneously maximize (1) prediction performance and (2) cost-effectiveness. To evaluate CostAwareCR, we performed a large empirical study on 146,612 code reviews across 3 large organizations, namely LibreOffice, Eclipse, and GerritHub. The results indicate that CostAwareCR achieves promising Area Under the Curve (AUC) scores ranging from 0.75 to 0.77.
Additionally, CostAwareCR outperforms various baseline approaches on effort-aware performance metrics, being able to prioritize the review of 87 P_opt ) indicating that our approach provides near-optimal code review prioritization based on review effort. Our results indicate that the multi-objective formulation is well suited to learning models that achieve good cost-effectiveness while keeping promising prediction performance.
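The two objectives named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that prediction performance is measured by AUC, that cost-effectiveness is the share of merged changes reached within a fixed fraction of the total review effort, and that changes are ranked by predicted probability divided by effort. The function names, the 20% budget default, and the toy data are all hypothetical. NSGA-II itself is not implemented here; only the two objectives and the Pareto-dominance relation it relies on are shown.

```python
import math

def predict_proba(weights, bias, features):
    """Logistic regression probability for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def auc(scores, labels):
    """AUC: chance that a merged change outscores a non-merged one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def merged_within_budget(scores, labels, efforts, budget=0.2):
    """Share of merged changes reviewed before `budget` of total effort is spent.

    Assumption: changes are reviewed in descending score/effort order,
    and every effort value (e.g. size of the change) is positive.
    """
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / efforts[i], reverse=True)
    limit = budget * sum(efforts)
    spent, found = 0.0, 0
    for i in order:
        if spent + efforts[i] > limit:
            break
        spent += efforts[i]
        found += labels[i]
    return found / sum(labels)

def evaluate(weights, bias, X, y, efforts):
    """Return the objective pair (AUC, cost-effectiveness) for one LR weight vector."""
    scores = [predict_proba(weights, bias, row) for row in X]
    return auc(scores, y), merged_within_budget(scores, y, efforts)

def dominates(a, b):
    """True if objective tuple `a` Pareto-dominates `b` (both objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Hypothetical toy data: one feature per change, merged label, effort as change size.
X = [[2.0], [1.0], [0.5], [-1.0], [-2.0]]
y = [1, 1, 0, 1, 0]
efforts = [10, 200, 20, 10, 60]

candidate_a = evaluate([1.0], 0.0, X, y, efforts)
candidate_b = evaluate([-1.0], 0.0, X, y, efforts)
```

A multi-objective search would call `evaluate` on each candidate weight vector and keep the candidates that no other candidate dominates, which is what yields a set of trade-off models instead of a single one.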

Keywords

Code review, Multi-Objective Optimization, Code review prioritization
