A Model of the Commit Size Distribution of Open Source
This work helps improve software development tools and understanding of software development by measuring and modeling a fundamental dimension of programming.
The authors model the probability distribution of commit sizes in open source projects, show that the model applies across different project sizes, and validate its goodness of fit with graphical and statistical methods.
A fundamental unit of work in programming is the code contribution ("commit") that a developer makes to the code base of a project. We use statistical methods to derive a model of the probability distribution of commit sizes in open source projects, and we show that the model applies to projects of different sizes. We validate the model's goodness of fit with both graphical and statistical methods. By measuring and modeling a fundamental dimension of programming, we help improve software development tools and our understanding of software development.
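
To make the workflow concrete, here is a minimal sketch of fitting a candidate distribution to commit sizes and checking the fit both statistically and graphically. It rests on several assumptions that are not taken from this abstract: the log-normal family is only a placeholder for whatever model the paper actually derives, the commit sizes are synthetic rather than mined from a repository, and the helper names (fit_lognormal, ks_statistic, qq_points) are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def fit_lognormal(sizes):
    """Maximum-likelihood fit of a log-normal: mean and std of log(size)."""
    logs = [math.log(s) for s in sizes]
    return fmean(logs), pstdev(logs)


def ks_statistic(sizes, mu, sigma):
    """Kolmogorov-Smirnov distance between the empirical CDF and the fitted CDF."""
    model = NormalDist(mu, sigma)
    xs = sorted(sizes)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = model.cdf(math.log(x))
        # Compare the model CDF with the empirical step just below and at x.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


def qq_points(sizes, mu, sigma, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """(model quantile, empirical quantile) pairs, the raw data of a Q-Q plot."""
    model = NormalDist(mu, sigma)
    xs = sorted(sizes)
    n = len(xs)
    return [(math.exp(model.inv_cdf(p)), xs[min(n - 1, int(p * n))]) for p in probs]


# Synthetic stand-in for real commit sizes (ASSUMPTION: not data from the paper).
rng = random.Random(0)
commit_sizes = [rng.lognormvariate(3.0, 1.2) for _ in range(2000)]

mu, sigma = fit_lognormal(commit_sizes)
d = ks_statistic(commit_sizes, mu, sigma)
pairs = qq_points(commit_sizes, mu, sigma)

print(f"fitted mu={mu:.3f}, sigma={sigma:.3f}, KS distance={d:.4f}")
for model_q, empirical_q in pairs:
    print(f"model quantile {model_q:9.2f}   empirical quantile {empirical_q:9.2f}")
```

A small KS distance and Q-Q pairs lying close to the diagonal would suggest that the candidate family describes the data well. Repeating the check on projects of different sizes is one way to probe whether a single model applies across them. Because the parameters are estimated from the same sample, the standard KS critical values do not apply directly, so the distance here is best read as a descriptive measure.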