SE, Jun 20, 2018

A Large-Scale Study on Source Code Reviewer Recommendation

arXiv:1806.07619v1, 33 citations
Originality: Synthesis-oriented
AI Analysis

This study addresses the time-consuming task of finding appropriate code reviewers in software development, but its contribution is incremental: it compares existing methods rather than introducing new ones.

This paper conducted a large-scale study comparing two source code reviewer recommendation algorithms (RevFinder and a Naive Bayes-based approach) using data from 51 projects with over 293K pull requests, finding that no single model works best across all projects and that repository type and sub-project information affect results.

Context: Software code reviews are an important part of the development process, leading to better software quality and reduced overall costs. However, finding appropriate code reviewers is a complex and time-consuming task. Goals: In this paper, we propose a large-scale study to compare the performance of two main source code reviewer recommendation algorithms (RevFinder and a Naive Bayes-based approach) in identifying the best code reviewers for opened pull requests. Method: We mined data from GitHub and Gerrit repositories, building a large dataset of 51 projects, with more than 293K pull requests analyzed, 180K owners and 157K reviewers. Results: Based on this large-scale analysis, we can state that i) no model can be generalized as best for all projects, ii) the choice of repository (Gerrit, GitHub) can have an impact on the recommendation results, iii) exploiting the sub-project information available in Gerrit can improve the recommendation results.
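To make the idea of a Naive Bayes-based reviewer recommender concrete, here is a minimal sketch. It is not the paper's implementation: the feature choice (tokens from the changed file paths), the Laplace smoothing, the class name `NaiveBayesReviewerRecommender`, and the toy review history are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def path_tokens(paths):
    """Split file paths into directory and file name tokens (assumed feature set)."""
    return [tok for p in paths for tok in p.split("/") if tok]


class NaiveBayesReviewerRecommender:
    """Hypothetical multinomial Naive Bayes ranker over file-path tokens."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing constant
        self.reviewer_counts = Counter()          # past reviews per reviewer
        self.token_counts = defaultdict(Counter)  # reviewer -> token -> count
        self.vocabulary = set()

    def fit(self, history):
        """history: iterable of (changed_paths, reviewer) pairs."""
        for paths, reviewer in history:
            self.reviewer_counts[reviewer] += 1
            for tok in path_tokens(paths):
                self.token_counts[reviewer][tok] += 1
                self.vocabulary.add(tok)
        return self

    def recommend(self, paths, k=3):
        """Return the k reviewers with the highest log posterior score."""
        total_reviews = sum(self.reviewer_counts.values())
        # One extra vocabulary slot is reserved for unseen tokens.
        vocab_size = len(self.vocabulary) + 1
        tokens = path_tokens(paths)
        scores = {}
        for reviewer, n_reviews in self.reviewer_counts.items():
            counts = self.token_counts[reviewer]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_reviews / total_reviews)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            scores[reviewer] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy review history (fabricated for illustration only).
history = [
    (["src/ui/button.py"], "alice"),
    (["src/ui/form.py"], "alice"),
    (["src/db/query.py"], "bob"),
    (["docs/readme.md"], "carol"),
]
model = NaiveBayesReviewerRecommender().fit(history)
print(model.recommend(["src/ui/menu.py"], k=2))  # ['alice', 'bob']
```

A study like the one described would train such a model on each project's past pull requests and check whether the actual reviewer appears among the top-k recommendations. The exact features and metrics the authors used are not stated in this summary.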
