LG, AI, CL, IR | Sep 4, 2025

Delta Activations: A Representation for Finetuned Large Language Models

arXiv:2509.04442v1 | 1 citation | h-index: 5 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of unstructured model repositories facing AI researchers and practitioners. It offers a tool for organizing and reusing publicly available LLMs, though the advance is incremental because it builds on existing finetuning practices.

The paper tackles the challenge of navigating and understanding the vast collection of finetuned Large Language Models (LLMs) by introducing Delta Activations, a method that represents each finetuned model as a vector embedding based on the shift in its internal activations relative to a base model. The resulting embeddings cluster effectively by domain and task, are robust across finetuning settings, and exhibit an additive property when finetuning datasets are mixed.

The success of powerful open-source Large Language Models (LLMs) has enabled the community to create a vast collection of post-trained models adapted to specific tasks and domains. However, navigating and understanding these models remains challenging due to inconsistent metadata and unstructured repositories. We introduce Delta Activations, a method to represent finetuned models as vector embeddings by measuring shifts in their internal activations relative to a base model. This representation allows for effective clustering by domain and task, revealing structure in the model landscape. Delta Activations also demonstrates desirable properties: it is robust across finetuning settings and exhibits an additive property when finetuning datasets are mixed. In addition, we show that Delta Activations can embed tasks via few-shot finetuning, and we further explore its use for model selection and merging. We hope Delta Activations can facilitate the practice of reusing publicly available models. Code is available at https://github.com/OscarXZQ/delta_activations.
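
The core idea in the abstract, embedding a finetuned model by how its internal activations shift relative to the base model, can be illustrated with a small toy sketch. This is not the authors' implementation (see the linked repository for that). The "models" here are hypothetical one-layer stand-ins built with NumPy, and the names `delta_embedding`, `make_model`, the probe inputs, and the choice to average the shift over probes are all assumptions made for illustration.

```python
import numpy as np

def delta_embedding(base_fn, tuned_fn, probes):
    """Average activation shift of tuned_fn relative to base_fn over the probes.

    Assumption: the embedding is the mean difference of activations on a
    shared set of probe inputs. The paper may aggregate differently.
    """
    base_acts = np.stack([base_fn(x) for x in probes])
    tuned_acts = np.stack([tuned_fn(x) for x in probes])
    return (tuned_acts - base_acts).mean(axis=0)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def make_model(weights):
    # Toy stand-in for "run the model and read off a hidden activation".
    return lambda x: np.tanh(weights @ x)

rng = np.random.default_rng(0)
dim = 16

def random_matrix():
    # Scaled so that pre-activations stay roughly unit-sized.
    return rng.normal(size=(dim, dim)) / np.sqrt(dim)

w_base = random_matrix()
probes = [rng.normal(size=dim) for _ in range(8)]

# Hypothetical "domain" updates: two models share the math update,
# one model receives an unrelated legal update.
math_shift = 0.3 * random_matrix()
legal_shift = 0.3 * random_matrix()

base = make_model(w_base)
math_a = make_model(w_base + math_shift + 0.02 * random_matrix())
math_b = make_model(w_base + math_shift + 0.02 * random_matrix())
legal = make_model(w_base + legal_shift)

emb_math_a = delta_embedding(base, math_a, probes)
emb_math_b = delta_embedding(base, math_b, probes)
emb_legal = delta_embedding(base, legal, probes)

same_domain = cosine(emb_math_a, emb_math_b)
cross_domain = cosine(emb_math_a, emb_legal)
print(f"same-domain similarity:  {same_domain:.3f}")
print(f"cross-domain similarity: {cross_domain:.3f}")
```

In this toy setup the two models that share an update direction end up with more similar embeddings than models finetuned in unrelated directions, which is the kind of structure a clustering step could then exploit. With real LLMs the activations would come from a chosen hidden layer of each model rather than from a single `tanh` layer.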

