LG · SD · AS · ML · Dec 1, 2018

SwishNet: A Fast Convolutional Neural Network for Speech, Music and Noise Classification and Segmentation

arXiv:1812.00149v1 · 48 citations
Originality: Incremental advance
AI Analysis

SwishNet provides a lightweight preprocessing tool for audio processing pipelines that is particularly useful for embedded systems, though the advance is incremental because it builds on existing CNN and knowledge distillation techniques.

The authors tackled the problem of classifying and segmenting audio into speech, music, and noise by proposing SwishNet, a fast 1D CNN, which achieved over 97% accuracy in clip classification and over 93% in frame-wise segmentation on the MUSAN corpus.
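To make the frame-wise segmentation task concrete, the sketch below shows one common post-processing step: merging consecutive frames that share a predicted label into timed segments. It is an illustration only, not the authors' pipeline. The function name `frames_to_segments` and the 10 ms default hop are assumptions that do not come from the paper.

```python
from itertools import groupby


def frames_to_segments(frame_labels, hop_seconds=0.01):
    """Merge runs of identical frame labels into (start, end, label) segments.

    hop_seconds is the assumed time step between consecutive frames
    (hypothetical default; use whatever the feature extractor uses).
    """
    segments = []
    start = 0  # index of the first frame in the current run
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of frames in this run
        segments.append(
            (start * hop_seconds, (start + length) * hop_seconds, label)
        )
        start += length
    return segments


if __name__ == "__main__":
    predicted = ["speech", "speech", "music", "noise", "noise", "noise"]
    for seg in frames_to_segments(predicted, hop_seconds=1.0):
        print(seg)
```

In practice a smoothing step (for example a median filter over the frame labels) is often applied before merging, so that isolated misclassified frames do not produce spurious one-frame segments.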

Speech, Music and Noise classification/segmentation is an important preprocessing step for audio processing/indexing. To this end, we propose a novel 1D Convolutional Neural Network (CNN) - SwishNet. It is a fast and lightweight architecture that operates on MFCC features which is suitable to be added to the front-end of an audio processing pipeline. We showed that the performance of our network can be improved by distilling knowledge from a 2D CNN, pretrained on ImageNet. We investigated the performance of our network on the MUSAN corpus - an openly available comprehensive collection of noise, music and speech samples, suitable for deep learning. The proposed network achieved high overall accuracy in clip (length of 0.5-2s) classification (>97% accuracy) and frame-wise segmentation (>93% accuracy) tasks with even higher accuracy (>99%) in speech/non-speech discrimination task. To verify the robustness of our model, we trained it on MUSAN and evaluated it on a different corpus - GTZAN and found good accuracy with very little fine-tuning. We also demonstrated that our model is fast on both CPU and GPU, consumes a low amount of memory and is suitable for implementation in embedded systems.
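The abstract does not say which distillation objective was used, so the following is a minimal sketch of the standard soft-target formulation (temperature-scaled teacher probabilities combined with the usual hard-label cross-entropy), not the paper's exact loss. The names `softmax` and `distillation_loss`, and the default values of `temperature` and `alpha`, are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are hypothetical defaults, not values from the paper.
    """
    # Soft term: cross-entropy between softened teacher and student outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student_soft))

    # Hard term: ordinary cross-entropy against the ground-truth class.
    p_student = softmax(student_logits)
    hard = -math.log(p_student[true_index])

    # The T^2 factor keeps the soft-term gradient scale comparable across T.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.0]  # e.g. scores for speech, music, noise
    print(distillation_loss([4.0, 1.0, 0.0], teacher, true_index=0))
    print(distillation_loss([0.0, 1.0, 4.0], teacher, true_index=0))
```

A student whose logits agree with the teacher and the true label receives a lower loss than one that disagrees with both, which is the signal that lets a small 1D CNN learn from a larger 2D CNN teacher.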

Code Implementations: 4 repos
