SE · Apr 30

The Grand Software Supply Chain of AI Systems

arXiv:2604.27781 · 70.5 · Has Code
Predicted impact: top 25% in SE · last 90 days · Originality: Incremental advance
AI Analysis

For AI system developers and security researchers, this paper highlights critical integrity vulnerabilities in the AI software supply chain that are currently unaddressed.

The paper analyzes the AI software supply chain across four architectural layers, identifying four structural gaps (verifiability, versioning, observability, traceability) that current mechanisms fail to address. A reference stack of 48 projects was found to have 4,664 direct dependencies, 11,508 transitive packages, and ~392M lines of code.

AI systems rest on software with weak integrity mechanisms, leaving them exposed at every stage from data acquisition to final inference. This paper makes the AI supply chain a first-class object of analysis, decomposing it across four architectural layers: data acquisition, model training, model inference, and a cross-cutting substrate. Within these layers, we identify four structural gaps that traditional supply chain mechanisms do not address: verifiability, versioning, observability, and traceability. Current AI systems fall short on all of them: they carry undeclared behavioral couplings that no resolver enforces; they cannot be reverted to known working assemblies; they degrade silently rather than surfacing breaking changes; and their lineage can hardly be approximated. To illustrate the scale of the software supply chain of AI, we measure a reference stack of 48 production-grade open-source projects, which declares 4,664 direct dependencies, resolves to 11,508 transitive packages, and totals roughly 392M lines of code.
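
The distinction between declared direct dependencies and resolved transitive packages can be illustrated with a minimal sketch. The graph, package names, and helper functions below are hypothetical and are not taken from the paper's reference stack or tooling; the closure computed here includes the direct dependencies themselves, which may or may not match how the paper counts.

```python
from collections import deque

# Hypothetical toy graph: package -> packages it declares directly.
# All names are illustrative assumptions, not the paper's data.
DECLARED = {
    "app": ["trainer", "server"],
    "trainer": ["tensorlib", "dataloader"],
    "server": ["tensorlib", "httpkit"],
    "tensorlib": ["blas"],
    "dataloader": ["codec"],
    "httpkit": [],
    "blas": [],
    "codec": [],
}


def direct_dependencies(graph, root):
    """Packages that `root` declares itself."""
    return set(graph.get(root, []))


def transitive_closure(graph, root):
    """Every package reachable from `root`, found by breadth-first search.

    The result includes the direct dependencies and excludes `root`
    unless the graph contains a cycle leading back to it.
    """
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # shared dependencies are counted once
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    return seen


direct = direct_dependencies(DECLARED, "app")
closure = transitive_closure(DECLARED, "app")
print(len(direct), len(closure))  # prints "2 7"
```

Even in this toy graph, the resolved set is several times larger than what the top-level project declares, and packages such as `blas` never appear in any manifest the project's maintainers wrote.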

