SD, AI, HC, LG, AS · Sep 29, 2025

Learning Relationships Between Separate Audio Tracks for Creative Applications

arXiv:2509.25296v1 · AIMC
Originality: Incremental advance
AI Analysis

This work addresses creative music generation for artists and musicians. It is an incremental advance in that it builds on existing methods such as Transformers and Wav2Vec 2.0.

The paper tackles the problem of learning musical relationships between separate audio tracks, with the goal of generating coherent musical output in real time from a live input. Using a corpus of paired tracks (A, B), it shows that the decision module can predict a coherent track B when conditioned on the corresponding guide track A.

This paper presents the first step in a research project situated within the field of musical agents. The objective is to achieve, through training, the tuning of the desired musical relationship between a live musical input and a real-time generated musical output, through the curation of a database of separated tracks. We propose an architecture integrating a symbolic decision module capable of learning and exploiting musical relationships from such a musical corpus. We detail an offline implementation of this architecture employing Transformers as the decision module, associated with a perception module based on Wav2Vec 2.0, and concatenative synthesis as audio renderer. We present a quantitative evaluation of the decision module's ability to reproduce learned relationships extracted during training. We demonstrate that our decision module can predict a coherent track B when conditioned on its corresponding "guide" track A, based on a corpus of paired tracks (A, B).
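
The three-stage flow described in the abstract (perception, symbolic decision, concatenative rendering) can be illustrated with a deliberately tiny toy sketch. Nothing here is the authors' implementation: nearest-centroid quantization stands in for the Wav2Vec 2.0 perception module, a co-occurrence lookup table stands in for the Transformer decision module, and all function names, centroids, labels, and sample values are hypothetical and chosen only to make the data flow visible.

```python
from collections import Counter, defaultdict


def quantize(frames, centroids):
    """Perception stand-in (NOT Wav2Vec 2.0): label each feature frame
    with the index of its nearest centroid."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(centroids)), key=lambda k: dist2(f, centroids[k]))
            for f in frames]


def fit_decision(pairs):
    """Decision stand-in (NOT a Transformer): for each A label, remember
    the B label that most often occurs at the same time step."""
    counts = defaultdict(Counter)
    for labels_a, labels_b in pairs:
        for a, b in zip(labels_a, labels_b):
            counts[a][b] += 1
    return {a: c.most_common(1)[0][0] for a, c in counts.items()}


def predict(table, labels_a, fallback=0):
    """Predict one B label per A label; unseen A labels get `fallback`."""
    return [table.get(a, fallback) for a in labels_a]


def render(labels_b, segments):
    """Renderer stand-in: concatenate the stored audio segment
    associated with each predicted label."""
    out = []
    for b in labels_b:
        out.extend(segments[b])
    return out


# Hypothetical toy data: two centroids, a corpus of paired label
# sequences (A, B), and one short "audio" segment per B label.
centroids = [(0.0, 0.0), (1.0, 1.0)]
corpus = [([0, 1, 1, 0], [1, 0, 0, 1]), ([1, 0], [0, 1])]
segments = {0: [0.0, 0.1], 1: [0.5, 0.6]}

table = fit_decision(corpus)
live_a = quantize([(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)], centroids)
pred_b = predict(table, live_a)
audio = render(pred_b, segments)
print(live_a, pred_b, audio)
```

In this toy setup the guide input is quantized to `[0, 1, 0]`, the lookup table predicts `[1, 0, 1]` for track B, and the renderer concatenates the matching segments. A sequence model such as a Transformer would replace the per-step lookup with predictions conditioned on the whole guide sequence.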

