Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation
This work addresses the automation of large-scale acoustic data analysis for marine mammal conservation. Its contribution is an incremental improvement: a new signal representation tailored to acoustic events that are sensitive to the choice of STFT parameters.
The paper tackles marine mammal species classification from acoustic recordings by proposing a Convolutional Neural Network and a novel acoustic representation based on stacked spectrograms. The classifier detects whale vocalizations and, through transfer learning, generalizes to additional species.
Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally, driven by the need to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that classifies acoustic signals into five classes: the vocalizations of three species of whales, non-biological sources of noise, and ambient noise. The classifier can therefore detect both the presence and the absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier learns high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram by interpolating and stacking multiple spectrograms produced with different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for marine mammal species classification, where the acoustic events to be classified are sensitive to the parameters of the STFT.
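To make the idea of interpolating and stacking spectrograms concrete, the following is a minimal sketch using NumPy only, not the authors' implementation. The function names (`spectrogram`, `resize_bilinear`, `stacked_spectrogram`) are hypothetical. The Hann window, the log compression, the linear interpolation, the three (window length, hop) pairs, and the 64 x 64 output grid are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import numpy as np


def spectrogram(signal, n_fft, hop):
    """Magnitude spectrogram of shape (n_fft // 2 + 1, n_frames).

    Assumption: Hann window, no padding; the source does not state these.
    """
    if len(signal) < n_fft:
        raise ValueError("signal is shorter than the STFT window")
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft along each frame, then transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T


def resize_bilinear(image, out_shape):
    """Resample a 2-D array onto out_shape with separable linear interpolation."""
    rows, cols = image.shape
    out_rows, out_cols = out_shape
    row_pos = np.linspace(0, rows - 1, out_rows)
    col_pos = np.linspace(0, cols - 1, out_cols)
    # First interpolate along the frequency axis, one column at a time.
    tmp = np.stack(
        [np.interp(row_pos, np.arange(rows), image[:, c]) for c in range(cols)],
        axis=1,
    )
    # Then interpolate along the time axis, one row at a time.
    return np.stack(
        [np.interp(col_pos, np.arange(cols), tmp[r, :]) for r in range(out_rows)],
        axis=0,
    )


def stacked_spectrogram(signal, stft_params, out_shape=(64, 64)):
    """Stack spectrograms computed with different STFT parameters as channels.

    stft_params is a list of (n_fft, hop) pairs (illustrative values only).
    """
    channels = []
    for n_fft, hop in stft_params:
        spec = np.log1p(spectrogram(signal, n_fft, hop))  # assumed log scaling
        channels.append(resize_bilinear(spec, out_shape))
    return np.stack(channels, axis=-1)


# Usage: a synthetic 50 Hz tone stands in for a real recording.
sample_rate = 1000
t = np.arange(2 * sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 50 * t)
stacked = stacked_spectrogram(signal, [(64, 16), (128, 32), (256, 64)])
print(stacked.shape)  # (64, 64, 3)
```

Each channel trades time resolution against frequency resolution differently: short windows localize brief events in time, while long windows resolve closely spaced frequencies. Resampling every spectrogram onto one grid lets a CNN consume them as a single multi-channel image, much like the color channels of a photograph.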