SEApr 22, 2019

Communication and Code Dependency Effects on Software Code Quality: An Empirical Analysis of Herbsleb Hypothesis

arXiv:1904.09954v38 citationsHas Code
Originality Incremental advance
AI Analysis

This addresses software engineering practices by showing that communication flows are more critical than hero developers for code quality, offering insights for project management in open source communities.

The study tested Herbsleb's hypothesis by analyzing over 1000 open source projects, finding that social communication metrics are a stronger predictor of code quality (bugs) than reliance on 'hero' developers.

Prior literature has suggested that in many projects 80\% or more of the contributions are made by a small called group of around 20% of the development team. Most prior studies deprecate a reliance on such a small inner group of "heroes", arguing that it causes bottlenecks in development and communication. Despite this, such projects are very common in open source projects. So what exactly is the impact of "heroes" in code quality? Herbsleb argues that if code is strongly connected yet their developers are not, then that code will be buggy. To test the Hersleb hypothesis, we develop and apply two metrics of (a) "social-ness'"and (b) "hero-ness" that measure (a) how much one developer comments on the issues of another; and (b) how much one developer changes another developer's code (and "heroes" are those that change the most code, all around the system). In a result endorsing the Hersleb hypothesis, in over 1000 open source projects, we find that "social-ness" is a statistically stronger indicate for code quality (number of bugs) than "hero-ness". Hence we say that debates over the merits of "hero-ness" is subtly misguided. Our results suggest that the real benefits of these so-called "heroes" is not so much the code they generate but the pattern of communication required when the interaction between a large community of programmers passes through a small group of centralized developers. To say that another way, to build better code, build better communication flows between core developers and the rest. In order to allow other researchers to confirm/improve/refute our results, all our scripts and data are available, on-line at https://github.com/Anonymous633671/A-Comparison-on-Communication-and-Code-Dependency-Effects-on-Software-Code-Quality.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes