LG | May 19, 2025

RoFL: Robust Fingerprinting of Language Models

arXiv:2505.12682v1 | 7 citations | h-index: 51
Originality: Incremental advance
AI Analysis

The work addresses model developers' need to enforce the licensing terms of their models. It is an incremental advance, building on prior fingerprinting and watermarking techniques.

The paper tackles the problem of identifying license violations: detecting whether an API or product uses a specific large language model (LLM) or an adapted version of it. It presents a method that uses robust statistical fingerprints for black-box identification without affecting model quality. In the authors' experiments, the method substantially outperforms prior methods.

AI developers are releasing large language models (LLMs) under a variety of licenses. Many of these licenses restrict the ways in which the models or their outputs may be used. This raises the question of how license violations may be recognized. In particular, how can we identify that an API or product uses (an adapted version of) a particular LLM? We present a new method that enables model developers to perform such identification via fingerprints: statistical patterns that are unique to the developer's model and robust to common alterations of that model. Our method permits model identification in a black-box setting using a limited number of queries, enabling identification of models that can only be accessed via an API or product. The fingerprints are non-invasive: our method does not require any changes to the model during training; hence, by design, it does not impact model quality. Empirically, we find that our method provides a high degree of robustness to common changes in the model or inference settings. In our experiments, it substantially outperforms prior art, including invasive methods that explicitly train watermarks into the model.
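To make the black-box setting concrete, here is a minimal sketch of how a fingerprint check with a limited number of queries might be scored. It is not the paper's algorithm: the abstract does not say how fingerprints are constructed or matched, so the prompt/response pairs, the prefix-match rule, and the names `check_fingerprint`, `query_model`, `chance_match_rate`, and `alpha` are all assumptions made for illustration. The only idea shown is a generic one: count how many fingerprint queries the suspect system answers as expected, then ask how unlikely that count would be for an unrelated model.

```python
from math import comb
from typing import Callable, Sequence, Tuple


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def check_fingerprint(
    query_model: Callable[[str], str],        # hypothetical black-box API wrapper
    fingerprints: Sequence[Tuple[str, str]],  # assumed (prompt, expected response) pairs
    chance_match_rate: float = 0.01,          # assumed match rate of an unrelated model
    alpha: float = 1e-3,                      # assumed significance threshold
) -> Tuple[int, float, bool]:
    """Query the black box once per fingerprint and test the match count.

    Returns (matches, p_value, flagged). The null hypothesis is that the
    system is unrelated and matches each fingerprint independently with
    probability `chance_match_rate`.
    """
    matches = sum(
        1
        for prompt, expected in fingerprints
        if query_model(prompt).strip().startswith(expected)
    )
    p_value = binomial_tail(matches, len(fingerprints), chance_match_rate)
    return matches, p_value, p_value < alpha
```

Under these assumptions, a system that reproduces every one of ten fingerprints yields a vanishingly small p-value and is flagged, while a system that reproduces none yields a p-value of 1 and is not. A robust scheme would additionally need fingerprints that survive fine-tuning and changes to inference settings, which this sketch does not address.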

Code Implementations: 1 repo
