Audio-only Bird Species Automated Identification Method with Limited Training Data Based on Multi-Channel Deep Convolutional Neural Networks
This work addresses automated bird species identification for ecological monitoring, but its contribution is incremental: it adapts existing methods to a specific domain and improves their efficiency.
The paper tackles bird species identification from audio with limited training data by proposing a multi-channel deep convolutional neural network based on transfer learning. The best configuration achieves a mean average precision of 0.9998 with only 13,110 parameters, which is 0.0082% of the original VGG16 model's size.
Based on transfer learning, we design a bird species identification model that uses the VGG16 model (pretrained on ImageNet) for feature extraction, followed by a classifier consisting of two fully connected hidden layers and a Softmax layer. We compare the performance of the proposed model with that of the original VGG16 model. The results show that the proposed model has higher training efficiency but a lower mean average precision (MAP). To improve the MAP of the proposed model, we investigate result fusion modes to form a multi-channel identification model; the best MAP reaches 0.9998. The model has 13,110 parameters, which is only 0.0082% of the VGG16 model. The number of training samples required is also reduced.
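
The abstract does not say how the channels are built or which fusion rule performs best, so the following is only a minimal sketch of one common form of result fusion. It assumes that each channel is a separately trained classifier that outputs one score per species, and that fusion is the mean of the per-channel Softmax probabilities. The logits, the three-species setup, and the function names are hypothetical and not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_mean(channel_probs):
    """Average the per-class probabilities across channels.

    Assumption: mean fusion. The paper may use a different rule.
    """
    n_channels = len(channel_probs)
    n_classes = len(channel_probs[0])
    return [
        sum(ch[k] for ch in channel_probs) / n_channels
        for k in range(n_classes)
    ]


def predict(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits for one recording: three channels, three species.
channel_logits = [
    [2.0, 1.0, 0.1],  # channel A favours species 0
    [0.5, 1.5, 0.2],  # channel B favours species 1
    [1.8, 0.9, 0.3],  # channel C favours species 0
]

channel_probs = [softmax(z) for z in channel_logits]
fused = fuse_mean(channel_probs)
fused_label = predict(fused)

print("per-channel labels:", [predict(p) for p in channel_probs])
print("fused label:", fused_label)
```

In this toy input one channel disagrees with the other two, and averaging the probabilities lets the majority evidence decide the fused label. Other fusion rules, such as weighted averaging or majority voting, would fit the same interface.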