LG, CR | May 28, 2025

Machine Learning Models Have a Supply Chain Problem

DeepMind
arXiv:2505.22778v14 citations | h-index: 26 | Has Code | ICML
Originality: Synthesis-oriented
AI Analysis

The paper addresses security vulnerabilities facing users of open ML models, though its contribution is incremental: it applies existing software supply-chain solutions to the ML domain.

The paper identifies significant supply-chain risks in the open machine learning model ecosystem, such as a model being replaced with a malicious one or trained on poisoned data, and proposes using Sigstore so that publishers can sign their models and prove properties of the datasets they use.

Powerful machine learning (ML) models are now readily available online, which creates exciting possibilities for users who lack the deep technical expertise or substantial computing resources needed to develop them. On the other hand, this type of open ecosystem comes with many risks. In this paper, we argue that the current ecosystem for open ML models contains significant supply-chain risks, some of which have been exploited already in real attacks. These include an attacker replacing a model with something malicious (e.g., malware), or a model being trained using a vulnerable version of a framework or on restricted or poisoned data. We then explore how Sigstore, a solution designed to bring transparency to open-source software supply chains, can be used to bring transparency to open ML models, in terms of enabling model publishers to sign their models and prove properties about the datasets they use.
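To make the idea of model signing more concrete, here is a minimal Python sketch of the step that typically precedes any signature: computing a deterministic manifest of file digests for a model directory, then re-checking it later. This is an illustration under stated assumptions, not the paper's implementation. The function names (`file_digest`, `build_manifest`, `manifest_bytes`, `verify_manifest`) are hypothetical, the manifest layout is invented for the example, and the actual Sigstore signing and verification of the manifest bytes is deliberately left out.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(model_dir: Path) -> dict:
    """Map each file's relative path to its digest (hypothetical layout)."""
    files = sorted(p for p in model_dir.rglob("*") if p.is_file())
    return {p.relative_to(model_dir).as_posix(): file_digest(p) for p in files}


def manifest_bytes(manifest: dict) -> bytes:
    """Serialize deterministically; these bytes are what a signer would sign."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verify_manifest(model_dir: Path, expected: dict) -> bool:
    """Recompute digests and compare against a previously recorded manifest."""
    return build_manifest(model_dir) == expected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        (root / "weights.bin").write_bytes(b"\x00\x01\x02")
        (root / "config.json").write_text('{"layers": 2}')

        manifest = build_manifest(root)
        print(verify_manifest(root, manifest))   # files unchanged

        # Simulate an attacker swapping the weights for something else.
        (root / "weights.bin").write_bytes(b"malicious payload")
        print(verify_manifest(root, manifest))   # digest no longer matches
```

A digest check like this only detects that files changed. It says nothing about who published them, which is the gap a signature over `manifest_bytes(manifest)` is meant to close.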
