SDAIASNov 23, 2021

Upsampling layers for music source separation

arXiv:2111.11773v13 citations
Originality Incremental advance
AI Analysis

This work addresses audio quality issues in music source separation, offering incremental improvements for practitioners.

The paper investigated the impact of upsampling artifacts on music source separation, finding that filtering artifacts from interpolation upsamplers are perceptually preferable despite worse objective scores.

Upsampling artifacts are caused by problematic upsampling layers and due to spectral replicas that emerge while upsampling. Also, depending on the used upsampling layer, such artifacts can either be tonal artifacts (additive high-frequency noise) or filtering artifacts (substractive, attenuating some bands). In this work we investigate the practical implications of having upsampling artifacts in the resulting audio, by studying how different artifacts interact and assessing their impact on the models' performance. To that end, we benchmark a large set of upsampling layers for music source separation: different transposed and subpixel convolution setups, different interpolation upsamplers (including two novel layers based on stretch and sinc interpolation), and different wavelet-based upsamplers (including a novel learnable wavelet layer). Our results show that filtering artifacts, associated with interpolation upsamplers, are perceptually preferrable, even if they tend to achieve worse objective scores.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes