ML LG SD AS, Mar 6, 2018

Masked Conditional Neural Networks for Audio Classification

arXiv:1803.02421v2, 18 citations
AI Analysis

This work addresses audio classification for music recognition, presenting an incremental improvement over existing methods.

The authors introduce the Masked Conditional Neural Network (MCLNN) for audio classification. It applies a binary mask to preserve the spatial locality of the features and to automate the exploration of feature combinations. It reaches competitive recognition accuracies on the GTZAN and ISMIR2004 music datasets, surpassing several state-of-the-art neural network and hand-crafted methods.

We present the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN) designed for temporal signal recognition. The CLNN takes into consideration the temporal nature of the sound signal, and the MCLNN extends the CLNN with a binary mask that preserves the spatial locality of the features and allows an automated exploration of feature combinations, analogous to hand-crafting the most relevant features for the recognition task. MCLNN has achieved competitive recognition accuracies on the GTZAN and the ISMIR2004 music datasets that surpass several state-of-the-art neural network based architectures and hand-crafted methods applied on both datasets.
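The masking idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `band_mask` and `mclnn_step`, the `bandwidth` and `overlap` parameters, the wrap-around band layout, and the ReLU activation are all assumptions made for illustration. It also assumes that the CLNN conditions each frame's output on a window of `order` neighbouring frames on either side, which the abstract does not spell out.

```python
def band_mask(n_in, n_out, bandwidth, overlap):
    """Hypothetical binary mask of shape n_in x n_out.

    Each output column enables `bandwidth` consecutive input rows, and
    successive columns shift by (bandwidth - overlap) rows, wrapping around.
    """
    step = bandwidth - overlap
    if step <= 0:
        raise ValueError("bandwidth must be larger than overlap")
    mask = [[0] * n_out for _ in range(n_in)]
    for j in range(n_out):
        start = (j * step) % n_in
        for k in range(bandwidth):
            mask[(start + k) % n_in][j] = 1
    return mask


def mclnn_step(frames, t, weights, bias, mask, order):
    """Output vector for frame t, conditioned on `order` frames per side.

    weights[u + order] is the n_in x n_out matrix for temporal offset u.
    The mask is applied element-wise, so each output unit only sees the
    band of input features the mask leaves enabled.
    """
    n_out = len(bias)
    out = list(bias)
    for u in range(-order, order + 1):
        x = frames[t + u]
        w = weights[u + order]
        for i, xi in enumerate(x):
            for j in range(n_out):
                out[j] += xi * w[i][j] * mask[i][j]
    return [max(0.0, v) for v in out]  # ReLU is an assumption here


# Toy usage: 4 input features (e.g. spectrogram bins), 3 hidden units.
mask = band_mask(n_in=4, n_out=3, bandwidth=2, overlap=1)
frames = [[1.0, 1.0, 1.0, 1.0] for _ in range(3)]
weights = [[[1.0] * 3 for _ in range(4)] for _ in range(3)]
hidden = mclnn_step(frames, t=1, weights=weights, bias=[0.0] * 3,
                    mask=mask, order=1)
```

Because the mask zeroes weights outside each band, every hidden unit combines only a contiguous group of neighbouring features, which is one way to read the abstract's "spatial locality"; shifting the band across columns gives different units different feature combinations.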

Code Implementations: 1 repo
