SD LG AS Dec 6, 2020

Source Separation and Depthwise Separable Convolutions for Computer Audition

arXiv:2012.03359v1
Originality: Incremental advance
AI Analysis

This work aims to improve audio classification in computer audition, particularly when training data is limited. It is relevant to researchers and practitioners working in audio analysis.

This paper explores a feature representation method that combines source separation with depthwise separable convolutions for computer audition. The authors demonstrate that source separation improves classification performance in a limited-data setting compared to using standard single spectrograms.

Given recent advances in deep music source separation, we propose a feature representation method that combines source separation with a state-of-the-art representation learning technique that is suitably repurposed for computer audition (i.e. machine listening). We train a depthwise separable convolutional neural network on a challenging electronic dance music (EDM) data set and compare its performance to convolutional neural networks operating on both source separated and standard spectrograms. It is shown that source separation improves classification performance in a limited-data setting compared to the standard single spectrogram approach.
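To make the appeal of depthwise separable convolutions in a limited-data setting more concrete, the sketch below compares the number of weights in a standard convolution with the number in a depthwise convolution followed by a 1 x 1 pointwise convolution. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) and the function names are assumptions chosen for illustration only; they are not taken from the paper's architecture.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias terms omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, in line with the general ratio 1/c_out + 1/k^2. Fewer trainable weights is one plausible reason such layers suit small data sets, though the abstract itself does not state this.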

