LG, SD, AS, ML | Nov 7, 2019

Change your singer: a transfer learning generative adversarial framework for song to song conversion

arXiv:1911.02933v2 | 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the need for high-quality, artist-specific song conversion in music production and entertainment. The advance is incremental, since it builds on existing voice conversion and GAN methods.

The paper tackles the problem of converting songs so that they sound as if sung by a different artist. It proposes SCM-GAN, a non-parallel song conversion system in which transfer learning improves global variance by 35% and modulation spectra by 13% on average. Subjective results show about 70% similarity to the target singer and high naturalness.

Have you ever wondered how a song might sound if performed by a different artist? In this work, we propose SCM-GAN, an end-to-end non-parallel song conversion system powered by generative adversarial learning and transfer learning that allows users to listen to a selected target singer singing any song. SCM-GAN first separates songs into vocals and instrumental music using a U-Net network, then converts the vocal segments to the target singer using an advanced CycleGAN-VC, before merging the converted vocals with their corresponding background music. SCM-GAN is first initialized with feature representations learned from a state-of-the-art voice-to-voice conversion model and then trained on a dataset of non-parallel songs. Furthermore, SCM-GAN is evaluated against a set of metrics including global variance (GV) and modulation spectra (MS) on the 24 Mel-cepstral coefficients (MCEPs). Transfer learning improves the GV by 35% and the MS by 13% on average. A subjective comparison is conducted to test user satisfaction with the quality and naturalness of the conversion. Results show above-par similarity between SCM-GAN's output and the target (70% on average) as well as great naturalness of the converted songs.
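
The two objective metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: GV as the per-coefficient variance of an MCEP sequence over time, and MS as the log power spectrum of each coefficient's temporal trajectory. The abstract does not give the paper's exact formulation, so the function names, the FFT length, and the `(frames, 24)` array layout below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def global_variance(mcep):
    """Per-coefficient variance over time.

    mcep: array of shape (frames, dims), e.g. dims = 24 MCEPs (assumed layout).
    Returns an array of shape (dims,).
    """
    return np.var(mcep, axis=0)


def modulation_spectrum(mcep, n_fft=256):
    """Log power spectrum of each coefficient's trajectory along time.

    n_fft is a hypothetical choice; sequences longer than n_fft are cropped
    and shorter ones are zero-padded by np.fft.rfft.
    Returns an array of shape (n_fft // 2 + 1, dims).
    """
    centered = mcep - mcep.mean(axis=0, keepdims=True)
    spectrum = np.fft.rfft(centered, n=n_fft, axis=0)
    # A small constant avoids log(0) for empty frequency bins.
    return np.log(np.abs(spectrum) ** 2 + 1e-12)


# Synthetic stand-in for real MCEP features (random data, not from the paper).
rng = np.random.default_rng(0)
target = rng.normal(size=(400, 24))
oversmoothed = 0.5 * target  # halving the dynamics quarters the variance

gv_target = global_variance(target)
gv_smooth = global_variance(oversmoothed)
ms_target = modulation_spectrum(target)
```

In this toy setup the over-smoothed sequence has a GV exactly one quarter of the target's, which is the kind of variance loss that GV is meant to expose in converted speech or singing.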

