SD, LG, AS · Feb 9, 2021

Enhancing Audio Augmentation Methods with Consistency Learning

arXiv:2102.05151v3 · 8 citations
AI Analysis

This is an incremental improvement for audio classification tasks using deep learning.

The paper addresses audio classification by adding a term to the loss function that explicitly enforces consistency across data augmentations, and shows that this improves performance over standard cross-entropy training.

Data augmentation is an inexpensive way to increase training data diversity and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss and that incorporating such measures into the loss function can improve the performance of audio classification systems. Put another way, we demonstrate how existing augmentation methods can further improve learning by enforcing consistency.
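The idea in the abstract can be illustrated with a minimal sketch of a combined training objective: cross-entropy on a clip and on its augmented copy, plus a penalty when the two predictions disagree. This summary does not state which consistency measure the paper uses, so the Jensen-Shannon divergence between the two predicted class distributions is an assumption made here for illustration, as are the function names and the `weight` parameter. The abstract speaks of invariant representations; comparing output distributions is one simple stand-in, not necessarily the authors' choice.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions (eps guards log(0))."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and zero when p equals q."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)

def consistency_loss(logits_clean, logits_aug, label, weight=1.0):
    """Cross-entropy on both views plus a weighted consistency penalty.

    ASSUMPTION: the JS-divergence penalty and the `weight` hyperparameter
    are illustrative choices, not taken from the paper.
    """
    p_clean = softmax(logits_clean)
    p_aug = softmax(logits_aug)
    ce = 0.5 * (cross_entropy(p_clean, label) + cross_entropy(p_aug, label))
    return ce + weight * js_divergence(p_clean, p_aug)

# Hypothetical logits for one clip and its augmented version (3 classes).
clean = [2.0, 0.5, -1.0]
augmented = [0.3, 1.2, -0.5]
loss_plain = consistency_loss(clean, augmented, label=0, weight=0.0)
loss_consistent = consistency_loss(clean, augmented, label=0, weight=1.0)
```

With `weight=0.0` the objective reduces to ordinary cross-entropy averaged over both views; a positive weight adds a cost whenever the model's prediction changes under augmentation, which is the constraint that cross-entropy alone does not impose.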

