LG | SD | AS | ML | Apr 8, 2018

Environmental Sound Recognition using Masked Conditional Neural Networks

arXiv:1804.02665v2 | 6 citations
AI Analysis

This work addresses sound recognition for environmental monitoring, but it is incremental as it builds on existing neural network methods.

The authors tackled environmental sound recognition by proposing a Masked Conditional Neural Network (MCLNN) that learns in frequency bands and automates feature exploration, achieving competitive accuracies on the ESC-10 dataset.

Neural network-based architectures used for sound recognition are usually adapted from other application domains, which may not harness sound-related properties. The ConditionaL Neural Network (CLNN) is designed to consider the relational properties across frames in a temporal signal, and its extension, the Masked ConditionaL Neural Network (MCLNN), embeds a filterbank-like behavior within the network, which forces the network to learn in frequency bands rather than bins. Additionally, it automates the exploration of different feature combinations, analogous to handcrafting the optimum combination of features for a recognition task. We applied the MCLNN to the environmental sounds of the ESC-10 dataset. The MCLNN achieved competitive accuracies compared to state-of-the-art convolutional neural networks and hand-crafted attempts.
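The two ideas in the abstract (conditioning each output frame on a window of neighbouring frames, and masking the weights so each hidden unit only sees a band of frequency bins) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names, the `bandwidth` and `overlap` parameters, the simple shifted-band mask layout, and the ReLU activation are all assumptions made for illustration.

```python
import numpy as np


def band_mask(n_in, n_hidden, bandwidth, overlap):
    """Binary (n_in, n_hidden) mask: each hidden unit sees one contiguous
    band of input bins. The layout is an illustrative assumption."""
    step = bandwidth - overlap  # shift of the band between neighbouring units
    if step <= 0:
        raise ValueError("bandwidth must be larger than overlap")
    mask = np.zeros((n_in, n_hidden))
    for j in range(n_hidden):
        start = (j * step) % n_in          # wrap around when bands run out
        stop = min(start + bandwidth, n_in)
        mask[start:stop, j] = 1.0
    return mask


def masked_conditional_layer(frames, weights, bias, mask, order):
    """frames: (T, n_in); weights: (2*order+1, n_in, n_hidden);
    bias: (n_hidden,). Returns (T - 2*order, n_hidden)."""
    masked = weights * mask                # mask broadcasts over window offsets
    outputs = []
    for t in range(order, frames.shape[0] - order):
        window = frames[t - order:t + order + 1]   # (2*order+1, n_in)
        # Sum over offsets u of x[t+u] @ (W[u] * mask), then add the bias.
        act = bias + np.einsum("ui,uih->h", window, masked)
        outputs.append(np.maximum(act, 0.0))       # ReLU, assumed here
    return np.stack(outputs)


rng = np.random.default_rng(0)
order, n_in, n_hidden = 2, 6, 4
frames = rng.normal(size=(10, n_in))       # 10 spectrogram frames, 6 bins each
weights = rng.normal(size=(2 * order + 1, n_in, n_hidden))
bias = np.zeros(n_hidden)
mask = band_mask(n_in, n_hidden, bandwidth=3, overlap=1)
hidden = masked_conditional_layer(frames, weights, bias, mask, order)
```

Because the mask zeroes every weight outside a unit's band, changing an input bin outside that band leaves the unit's activation unchanged, which is the sense in which the layer learns in bands rather than individual bins.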

Code Implementations: 1 repo
