Why Developers Refactor Source Code: A Mining-based Study
This research provides a generalized understanding of why software developers refactor, complementing previous survey-based studies.
This study investigates developer motivations for refactoring by analyzing 287,813 refactoring operations across 150 open-source projects and manually classifying motivations from 551 pull requests. It provides quantitative evidence linking refactoring to process/product metrics and a detailed taxonomy of refactoring motivations.
Refactoring aims to improve the non-functional attributes of code without modifying its external behavior. Previous studies investigated the motivations behind refactoring by surveying developers. To generalize and complement their findings, we present a large-scale study that quantitatively and qualitatively investigates why developers perform refactoring in open-source projects. First, we mine 287,813 refactoring operations performed in the history of 150 systems. Using this dataset, we investigate the interplay between refactoring operations and process metrics (e.g., previous changes/fixes) and product metrics (e.g., quality metrics). Then, we manually analyze 551 merged pull requests implementing refactoring operations and classify the motivations behind the implemented refactorings (e.g., removal of code duplication). Our results provide (i) quantitative evidence of the relationship between certain process/product metrics and refactoring operations and (ii) a detailed taxonomy of the motivations that push developers to refactor source code, generalizing and complementing those in the literature.