SD AI AS
Jun 11, 2025

Fine-Grained control over Music Generation with Activation Steering

arXiv:2506.10225v1
h-index: 2
Originality: Incremental advance
AI Analysis

The method offers an incremental improvement for users who need both global and local control in music generation systems.

The researchers address fine-grained control over music generation with an inference-time intervention on the MusicGen transformer. Activation steering enables timbre transfer, style transfer, and genre fusion, and the authors report improved performance when the problem is modeled as a regression task.

We present a method for fine-grained control over music generation through inference-time interventions on an autoregressive generative music transformer called MusicGen. Our approach enables timbre transfer, style transfer, and genre fusion by steering the residual stream using weights of linear probes trained on it, or by steering the attention layer activations in a similar manner. We observe that modelling this as a regression task provides improved performance, hypothesizing that the mean-squared-error loss better preserves meaningful directional information in the activation space. Combined with the global conditioning offered by text prompts in MusicGen, our method provides both global and local control over music generation. Audio samples illustrating our method are available at our demo page.
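
The steering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the activations are synthetic stand-ins for MusicGen's residual stream, the function names `fit_linear_probe` and `steer` are hypothetical, and both the closed-form ridge solution and the steering strength `alpha` are assumptions chosen for illustration. The sketch fits a linear probe under a mean-squared-error objective, then shifts hidden states along the probe's weight direction.

```python
import numpy as np

def fit_linear_probe(acts, targets, ridge=1e-3):
    """Fit w, b by ridge-regularized least squares (an MSE regression probe)."""
    mean = acts.mean(axis=0)
    centered = acts - mean
    t_mean = targets.mean()
    d = acts.shape[1]
    gram = centered.T @ centered + ridge * np.eye(d)
    w = np.linalg.solve(gram, centered.T @ (targets - t_mean))
    b = t_mean - mean @ w
    return w, b

def steer(hidden, w, alpha):
    """Add alpha times the unit-norm probe direction to every position."""
    direction = w / np.linalg.norm(w)
    return hidden + alpha * direction

# Synthetic data standing in for residual-stream activations (assumption).
rng = np.random.default_rng(0)
d_model = 16
true_dir = rng.normal(size=d_model)
acts = rng.normal(size=(200, d_model))
targets = acts @ true_dir + 0.01 * rng.normal(size=200)  # toy attribute score

w, b = fit_linear_probe(acts, targets)

hidden = rng.normal(size=(5, d_model))  # 5 time steps of a toy hidden state
steered = steer(hidden, w, alpha=4.0)

before = hidden @ w + b   # probe readout before steering
after = steered @ w + b   # probe readout after steering
```

Because the shift is along the probe's own weight vector, the probe readout at every position rises by `alpha` times the norm of `w`. In a real model, the same addition would presumably be applied to a chosen layer's output during generation, for instance through a forward hook.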

