SE · DC · Apr 13

GitFarm: Git as a Service for Large-Scale Monorepos

arXiv:2604.11977 · 3.5
Predicted impact: top 99% in SE · last 90 days
Originality: Incremental advance
AI Analysis

For large-scale monorepo users (e.g., Uber's automation systems), GitFarm addresses the bottleneck of slow Git operations and high infrastructure overhead.

Uber's large-scale monorepos turn routine Git workflows into bottlenecks, with multi-minute clone times and heavy load on upstream Git servers. GitFarm removes the need for local clones by executing Git operations remotely in sandboxes backed by pre-warmed repositories, providing a ready-to-use checkout in under a second and avoiding cold starts of up to 15 minutes.

At the scale of Uber's monorepos, traditional Git workflows become a fundamental bottleneck. Cloning multi-gigabyte repositories, maintaining local checkouts, periodically syncing from upstream, and executing repetitive fetch or push operations consume substantial compute and I/O across hundreds of automation systems. Although CI (Continuous Integration) systems such as Jenkins and Buildkite provide caching mechanisms to reduce clone times, in practice, these approaches incur significant infrastructure overhead, manual maintenance, inconsistent cache hit rates, and cold start latencies of several minutes for large monorepos. Moreover, thousands of independent clone and fetch operations add heavy load on upstream Git servers, making them slow and difficult to scale. To address these limitations, we present GitFarm, a platform that provides Git as a stateful, identity-scoped, repository-centric execution service through a gRPC API. GitFarm decouples repository management from clients by executing Git operations remotely within secure, ephemeral sandboxes backed by pre-warmed repositories. The system enforces identity-scoped authorization, supports multi-command workflows, and leverages specialized backend clusters for workload isolation. For clients, this design eliminates local clones, provides a ready-to-use checkout in less than a second, and significantly lowers client-side compute and I/O overhead by offloading operations to GitFarm. Also, client services no longer experience cold starts (up to 15 minutes) due to initial clones of the monorepos on each host. The results demonstrate that Git as a service provides substantial performance and cost benefits, while preserving the flexibility of native Git semantics.
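The abstract does not give GitFarm's actual gRPC interface, so the following is only a minimal in-process sketch of the ideas it describes: a pool of pre-warmed checkouts, an identity-scoped authorization check, and a session that runs several Git commands against the same sandbox. Every name here (`GitFarmSketch`, `Session`, `Sandbox`, `open_session`, the `acl` mapping, `pool_size`) is hypothetical, and no real Git command or network call is made.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """Stand-in for an ephemeral sandbox backed by a pre-warmed repository."""
    repo: str
    history: list = field(default_factory=list)

    def run(self, argv):
        # A real backend would execute Git here; this sketch only records the command.
        self.history.append(tuple(argv))
        return "ran: git " + " ".join(argv)


class Session:
    """One client's multi-command workflow, bound to a single sandbox."""

    def __init__(self, farm, identity, sandbox):
        self.farm, self.identity, self.sandbox = farm, identity, sandbox
        self.closed = False

    def run(self, *argv):
        if self.closed:
            raise RuntimeError("session is closed")
        return self.sandbox.run(argv)

    def close(self):
        # Ephemeral: the used sandbox is discarded, never returned to the pool.
        if not self.closed:
            self.closed = True
            pool = self.farm.pools[self.sandbox.repo]
            if len(pool) < self.farm.pool_size:
                pool.append(Sandbox(self.sandbox.repo))  # replenish with a fresh one

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


class GitFarmSketch:
    """Toy model of a repository-centric Git service (hypothetical, not the real API)."""

    def __init__(self, repos, acl, pool_size=2):
        self.acl = acl              # identity -> set of repositories it may use
        self.pool_size = pool_size
        # Pre-warm a pool per repository so opening a session needs no clone.
        self.pools = {r: deque(Sandbox(r) for _ in range(pool_size)) for r in repos}

    def open_session(self, identity, repo):
        if repo not in self.acl.get(identity, set()):
            raise PermissionError(f"{identity} may not access {repo}")
        pool = self.pools[repo]
        # Fall back to a freshly created sandbox if the pool is exhausted.
        sandbox = pool.popleft() if pool else Sandbox(repo)
        return Session(self, identity, sandbox)


farm = GitFarmSketch(repos=["monorepo"], acl={"ci-bot": {"monorepo"}})
with farm.open_session("ci-bot", "monorepo") as session:
    session.run("checkout", "-b", "fix-typo")
    session.run("commit", "-am", "Fix typo")
    session.run("push", "origin", "fix-typo")
```

In this sketch the client never holds a clone: it names a repository, proves an identity, and issues commands. That is the decoupling the abstract describes, although the real system does this over gRPC with isolated backend clusters.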

