A Novel Approach for Estimating Truck Factors
This work addresses a gap in software engineering by providing a practical method for assessing developer turnover risk. The contribution is incremental: it builds on an existing metric rather than fundamentally changing the field.
The authors address the lack of consensus on how to calculate the Truck Factor (TF), a metric of knowledge concentration in software projects, and the lack of evidence supporting existing estimates. They propose an automated approach, apply it to 133 GitHub projects, and survey developers to validate the results. They find that 65% of systems have TF <= 2, and that 53% of valid survey answers were positive or partially positive about the estimated truck factors.
Truck Factor (TF) is a metric proposed by the agile community as a tool to identify concentration of knowledge in software development environments. It states the minimal number of developers that have to be hit by a truck (or quit) before a project is incapacitated. In other words, TF helps to measure how prepared a project is to deal with developer turnover. Despite its clear relevance, few studies explore this metric. Moreover, there is no consensus on how to calculate it, and no evidence supports estimates for systems in the wild. To mitigate both issues, we propose a novel (and automated) approach for estimating TF values, which we execute against a corpus of 133 popular projects on GitHub. We then survey developers to assess the reliability of our results. Among other findings, we observe that the majority of our target systems (65%) have TF <= 2. Surveying developers from 67 target systems lends confidence to our estimates: in 84% of the valid answers we collect, developers agree or partially agree that the developers counted in the TF are the main authors of their systems; in 53%, we receive a positive or partially positive answer regarding our estimated truck factors.
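
The definition above can be made concrete with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not detail. It is a simple greedy heuristic under stated assumptions: authorship is given as a hypothetical mapping from file path to a set of author names, a file is "orphaned" once all of its authors are gone, and a project counts as incapacitated when more than half of its files are orphaned. The function name `estimate_truck_factor`, the parameter `orphan_threshold`, and the 50% default are illustrative choices, not taken from the source.

```python
from collections import Counter


def estimate_truck_factor(file_authors, orphan_threshold=0.5):
    """Greedy truck-factor estimate (illustrative sketch only).

    file_authors: dict mapping file path -> set of author names (assumed input).
    orphan_threshold: fraction of orphaned files above which the project is
        treated as incapacitated (assumed value, not from the source).
    Returns (truck_factor, removed_developers).
    """
    # Copy the author sets so the caller's data is not modified.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    if total == 0:
        return 0, []

    removed = []
    while True:
        # A file is orphaned when none of its authors remain.
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned / total > orphan_threshold:
            return len(removed), removed

        # Count how many files each remaining developer authors.
        counts = Counter(dev for authors in remaining.values() for dev in authors)
        if not counts:
            # Nobody left to remove; the threshold cannot be exceeded.
            return len(removed), removed

        # Remove the developer with the most files (ties broken by name).
        top = min(counts, key=lambda dev: (-counts[dev], dev))
        removed.append(top)
        for authors in remaining.values():
            authors.discard(top)


# Hypothetical project: alice authors most files.
files = {
    "core.py": {"alice"},
    "api.py": {"alice", "bob"},
    "cli.py": {"alice"},
    "docs.md": {"carol"},
}
tf, removed = estimate_truck_factor(files)
print(tf, removed)  # 2 ['alice', 'bob']
```

In this toy project, removing alice orphans exactly half of the files, which does not exceed the assumed threshold; removing bob as well orphans three of four files, so the sketch reports a truck factor of 2. A greedy removal order gives an estimate rather than a guaranteed minimum, and a real analysis would also need a principled way to decide who counts as an author of each file.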