SD AI LG MM AS
Jan 30, 2023

SingSong: Generating musical accompaniments from singing

Stanford
arXiv:2301.12662v1
77 citations
h-index: 51
Originality: Incremental advance
AI Analysis

SingSong offers musicians and non-musicians an intuitive way to create music with their own voice. The advance is incremental, however, because it builds on existing source separation and audio generation methods.

The paper tackles the problem of generating instrumental accompaniments from input vocals, presenting SingSong, which adapts AudioLM for conditional audio-to-audio generation and achieves listener preference over a retrieval baseline.
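One way to check whether a pairwise listener preference of this kind is statistically significant is an exact two-sided sign (binomial) test. The sketch below is illustrative only: this summary does not say which test the authors used, and the function name `binomial_two_sided_p` and the vote counts are hypothetical, not results from the paper.

```python
from math import comb


def binomial_two_sided_p(wins: int, trials: int) -> float:
    """Exact two-sided p-value under H0: each system is preferred with probability 0.5.

    `wins` is the number of comparisons in which one system was preferred,
    `trials` is the total number of comparisons (ties excluded).
    """
    if not 0 <= wins <= trials:
        raise ValueError("wins must be between 0 and trials")
    # With p = 0.5 the distribution is symmetric, so the two-sided p-value
    # is twice the upper tail starting at the larger of the two counts.
    k = max(wins, trials - wins)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only (not data from the paper).
p_value = binomial_two_sided_p(wins=66, trials=100)
print(f"two-sided p-value: {p_value:.4g}")
```

A small p-value would indicate that the observed split is unlikely if listeners had no real preference between the two systems.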

We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice. To accomplish this, we build on recent developments in musical source separation and audio generation. Specifically, we apply a state-of-the-art source separation algorithm to a large corpus of music audio to produce aligned pairs of vocals and instrumental sources. Then, we adapt AudioLM (Borsos et al., 2022) -- a state-of-the-art approach for unconditional audio generation -- to be suitable for conditional "audio-to-audio" generation tasks, and train it on the source-separated (vocal, instrumental) pairs. In a pairwise comparison with the same vocal inputs, listeners expressed a significant preference for instrumentals generated by SingSong compared to those from a strong retrieval baseline. Sound examples at https://g.co/magenta/singsong
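The data-preparation step described in the abstract (separate each recording into vocals and instrumental, then keep the two stems time-aligned as training pairs) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `toy_separate` is a hypothetical stand-in for a real source-separation model, `make_aligned_pairs` and `clip_len` are names invented here, and audio is represented as a plain list of samples.

```python
from typing import Callable, List, Sequence, Tuple

Audio = List[float]  # mono waveform as a list of samples (assumption for this sketch)


def toy_separate(mixture: Sequence[float]) -> Tuple[Audio, Audio]:
    """Hypothetical stand-in for a trained source-separation model.

    It simply halves each sample so that the two stems sum back to the
    mixture; a real separator would estimate vocals and instrumental.
    """
    vocals = [0.5 * s for s in mixture]
    instrumental = [s - v for s, v in zip(mixture, vocals)]
    return vocals, instrumental


def make_aligned_pairs(
    mixture: Sequence[float],
    clip_len: int,
    separate: Callable[[Sequence[float]], Tuple[Audio, Audio]] = toy_separate,
) -> List[Tuple[Audio, Audio]]:
    """Cut separated stems into non-overlapping (vocal, instrumental) clips.

    Both clips in a pair cover the same sample range, so the vocal clip can
    serve as the conditioning input and the instrumental clip as the target.
    A trailing partial clip is dropped.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    vocals, instrumental = separate(mixture)
    pairs = []
    for start in range(0, len(mixture) - clip_len + 1, clip_len):
        end = start + clip_len
        pairs.append((vocals[start:end], instrumental[start:end]))
    return pairs


# Toy usage: a 10-sample "recording" cut into 4-sample clips.
mixture = [float(i) for i in range(10)]
pairs = make_aligned_pairs(mixture, clip_len=4)
print(len(pairs))  # 2
```

In the setup the abstract describes, a conditional generative model would then be trained to produce the instrumental clip given the vocal clip; that training step is outside the scope of this sketch.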
