SDLGAS, May 18, 2025

Discovering and Steering Interpretable Concepts in Large Generative Music Models

MIT
arXiv:2505.18186v2
5 citations
h-index: 11
Originality: Incremental advance
AI Analysis

This work provides an empirical tool for improving transparency and uncovering organizing principles in generative music models, which could benefit researchers in AI and music theory.

The authors address the problem of interpreting large generative music models. They use sparse autoencoders to discover interpretable concepts, which include both familiar musical patterns and novel, uncodified ones, and they show that these concepts can steer model generations.

The fidelity with which neural networks can now generate content such as music presents a scientific opportunity: these systems appear to have learned implicit theories of such content's structure through statistical learning alone. This offers a potentially new lens on theories of human-generated media. When internal representations align with traditional constructs (e.g. chord progressions in music), they show how such categories can emerge from statistical regularities; when they diverge, they expose limits of existing frameworks and patterns we may have overlooked but that nonetheless carry explanatory power. In this paper, focusing on music generators, we introduce a method for discovering interpretable concepts using sparse autoencoders (SAEs), extracting interpretable features from the residual stream of a transformer model. We make this approach scalable and evaluable using automated labeling and validation pipelines. Our results reveal both familiar musical concepts and coherent but uncodified patterns lacking clear counterparts in theory or language. As an extension, we show such concepts can be used to steer model generations. Beyond improving model transparency, our work provides an empirical tool for uncovering organizing principles that have eluded traditional methods of analysis and synthesis.
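The abstract's core mechanism, a sparse autoencoder over residual-stream activations followed by steering, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`vec_mat`, `sae_encode`, `sae_decode`, `sae_loss`, `steer`), the ReLU encoder, the L1 sparsity penalty, and the choice to steer by adding a normalized decoder row to the activation are common conventions in the SAE literature, not details confirmed by the text above.

```python
import math

def vec_mat(x, W):
    """Row vector x (length d_in) times matrix W (d_in x d_out)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def sae_encode(x, W_enc, b_enc):
    """Map a residual-stream activation to non-negative feature activations (ReLU)."""
    return [max(0.0, p + b) for p, b in zip(vec_mat(x, W_enc), b_enc)]

def sae_decode(f, W_dec, b_dec):
    """Reconstruct the activation as a weighted sum of decoder rows plus a bias."""
    return [r + b for r, b in zip(vec_mat(f, W_dec), b_dec)]

def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """Squared reconstruction error plus an L1 penalty on the features (assumed objective)."""
    f = sae_encode(x, W_enc, b_enc)
    x_hat = sae_decode(f, W_dec, b_dec)
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(v) for v in f)
    return recon + l1_coeff * sparsity

def steer(x, W_dec, feature_idx, strength):
    """Shift an activation along one feature's unit-norm decoder direction (assumed steering rule)."""
    direction = W_dec[feature_idx]
    norm = math.sqrt(sum(v * v for v in direction))
    return [a + strength * v / norm for a, v in zip(x, direction)]

# Toy example: a 2-dimensional "residual stream" and 3 hypothetical features.
x = [1.0, -2.0]
W_enc = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
b_enc = [0.0, 0.0, 0.0]
W_dec = [[3.0, 4.0], [0.0, 1.0], [1.0, 0.0]]
b_dec = [0.0, 0.0]

features = sae_encode(x, W_enc, b_enc)   # only feature 0 is active for this input
steered = steer(x, W_dec, feature_idx=0, strength=5.0)
```

In a real setting the weights would be trained on activations collected from the music model, and the steered activation would be written back into the forward pass during generation; both steps are omitted here.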

