SE, SI, AO, SOC-PH | Jun 23, 2015

How do OSS projects change in number and size? A large-scale analysis to test a model of project growth

arXiv:1506.06924v1 | 9 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This research provides insights into project growth patterns for the OSS community, though it is incremental as it applies an existing firm growth model to OSS data.

The study analyzed the growth dynamics of Open Source Software projects using a large-scale SourceForge dataset covering 10 years. It found exponential growth in the number of projects and developers, with a notable increase in single-developer projects after 2009. It also verified that the Yule-Simon distribution fits the size distribution of collaborative projects, except during periods when the modeling assumptions failed because established developers founded multiple projects.

Established Open Source Software (OSS) projects can grow in size if new developers join, and the number of OSS projects can grow if developers choose to found new projects. We discuss to what extent an established model for firm growth can be applied to the dynamics of OSS projects. Our analysis is based on a large-scale data set from SourceForge (SF) consisting of monthly data for 10 years, covering up to 360,000 OSS projects and up to 340,000 developers. Over this time period, we find exponential growth in both the number of projects and the number of developers, with a remarkable increase in single-developer projects after 2009. We analyze the monthly entry and exit rates for both projects and developers, the growth rate of established projects, and the monthly project size distribution. To derive a prediction for the latter, we use modeling assumptions about how newly entering developers choose either to found a new project or to join an existing one. Our model applies only to collaborative projects, that is, projects expected to grow in size by attracting new developers. We verify, by a thorough statistical analysis, that the Yule-Simon distribution is a valid candidate for the size distribution of collaborative projects, except for certain time periods where the modeling assumptions no longer hold. We detect and empirically test the reason for this limitation, namely that an increasing number of established developers found additional new projects after 2009.
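To illustrate the kind of entry rule the abstract describes, the following is a minimal sketch of a textbook Simon-type process, not the paper's actual model or code. It assumes each entering developer founds a new project with a fixed probability `p_found` (a hypothetical parameter name) and otherwise joins an existing project chosen with probability proportional to its current size. Under these assumptions the project sizes approach a Yule-Simon distribution with shape `rho = 1 / (1 - p_found)`. The function names and the chosen parameter values are illustrative only.

```python
import math
import random
from collections import Counter


def simulate_project_sizes(n_developers, p_found, seed=0):
    """Simulate a Simon-type process and return the list of project sizes.

    Assumption (not taken from the paper): every developer belongs to
    exactly one project and never leaves.
    """
    rng = random.Random(seed)
    membership = [0]  # project id of each developer; the first one founds project 0
    n_projects = 1
    for _ in range(n_developers - 1):
        if rng.random() < p_found:
            # The entering developer founds a new single-developer project.
            membership.append(n_projects)
            n_projects += 1
        else:
            # Picking a uniformly random existing developer and joining their
            # project selects a project with probability proportional to size.
            membership.append(rng.choice(membership))
    return list(Counter(membership).values())


def yule_simon_pmf(k, rho):
    """Yule-Simon probability mass: rho * B(k, rho + 1), for k = 1, 2, ..."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)


p_found = 0.3  # illustrative value, not an estimate from the SourceForge data
sizes = simulate_project_sizes(50_000, p_found)
rho = 1.0 / (1.0 - p_found)
size_counts = Counter(sizes)

# Compare the simulated share of projects of size k with the Yule-Simon prediction.
for k in (1, 2, 3, 5, 10):
    empirical = size_counts[k] / len(sizes)
    print(f"size {k:2d}: simulated {empirical:.4f}, Yule-Simon {yule_simon_pmf(k, rho):.4f}")
```

In this sketch the only way a project appears is through a newly entering developer. If established developers also found additional projects, as the abstract reports for the period after 2009, that assumption is violated and the simulated agreement with the Yule-Simon form should not be expected to carry over.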
