SD, AS · Mar 4, 2019

Data Augmentation for Drum Transcription with Convolutional Neural Networks

arXiv:1903.01416v1 · 12 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the shortage of annotated data faced by researchers and practitioners in audio processing, but its contribution is incremental: it applies existing augmentation techniques to a specific domain.

The paper tackles the problem of data scarcity in drum transcription by investigating data augmentation strategies, showing that a CNN-based transcription algorithm benefits from these methods.

A recurrent issue in deep learning is the scarcity of data, in particular precisely annotated data. Few publicly available databases are correctly annotated, and generating correct labels is very time-consuming. The present article investigates data augmentation strategies for neural network training, particularly for tasks related to drum transcription, which need very precise annotations. It examines state-of-the-art sound transformation algorithms (remixing noise and sinusoidal parts, remixing attacks, and transposing with and without time compensation) and compares them to basic regularization methods such as dropout and additive Gaussian noise. Finally, it shows how a CNN-based drum transcription algorithm benefits from the proposed data augmentation strategy.
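To make two of the transformations named in the abstract concrete, here is a minimal sketch of additive Gaussian noise and of transposition without time compensation. It is not the paper's implementation: the function names `add_gaussian_noise` and `transpose_by_resampling`, the SNR parameterization, and the use of plain linear-interpolation resampling (with no anti-aliasing filter) are all assumptions made for illustration. The point it shows is that transposing without time compensation changes the signal's duration, so the onset annotations must be rescaled by the same factor.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng):
    """Add white Gaussian noise at a target signal-to-noise ratio in dB.

    Hypothetical helper; the SNR parameterization is an assumption.
    """
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def transpose_by_resampling(signal, onsets_sec, semitones):
    """Transpose by resampling, i.e. without time compensation.

    Hypothetical helper using naive linear interpolation. Raising the
    pitch shortens the signal, so onset times are divided by the same
    speed factor to keep the annotations aligned with the audio.
    """
    factor = 2.0 ** (semitones / 12.0)           # playback speed factor
    n_out = int(round(len(signal) / factor))     # new length in samples
    positions = np.arange(n_out) * factor        # read positions in the input
    resampled = np.interp(positions, np.arange(len(signal)), signal)
    new_onsets = np.asarray(onsets_sec, dtype=float) / factor
    return resampled, new_onsets

# Synthetic one-second example: a decaying 180 Hz burst starting at 0.5 s.
sr = 44100
t = np.arange(sr) / sr
hit = np.where(t >= 0.5,
               np.exp(-30.0 * (t - 0.5)) * np.sin(2 * np.pi * 180.0 * (t - 0.5)),
               0.0)

rng = np.random.default_rng(0)
noisy = add_gaussian_noise(hit, snr_db=20.0, rng=rng)
shifted, shifted_onsets = transpose_by_resampling(hit, [0.5], semitones=12)
```

Additive noise leaves the annotations untouched, whereas the resampling-based transposition requires relabeling. That difference matters for a task like drum transcription, where the abstract stresses the need for very precise annotations.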

