LG

Apr 9, 2025

Clustering and novel class recognition: evaluating bioacoustic deep learning feature extractors

Vincent S. Kather, Burooj Ghani, Dan Stowell

arXiv:2504.06710v1

15.711 citations

h-index: 8

Proceedings of the 11th Convention of the European Acoustics Association Forum Acusticum / EuroNoise 2025

Originality: Synthesis-oriented

AI Analysis

This work addresses a gap in computational bioacoustics for researchers needing to compare models across taxonomic groups, though it is incremental as it builds on existing evaluation methods.

The paper tackles the problem of comparing bioacoustic deep learning models trained on different species. It evaluates the models' feature extractors directly, through clustering and kNN classification of their embeddings, and the authors argue that this approach allows adaptability to be assessed beyond the classes seen in training.

In computational bioacoustics, deep learning models are composed of feature extractors and classifiers. The feature extractors generate vector representations of the input sound segments, called embeddings, which can be input to a classifier. While benchmarking of classification scores provides insights into specific performance statistics, it is limited to species that are included in the models' training data. Furthermore, it makes it impossible to compare models trained on very different taxonomic groups. This paper aims to address this gap by analyzing the embeddings generated by the feature extractors of 15 bioacoustic models spanning a wide range of setups (model architectures, training data, training paradigms). We evaluated and compared different ways in which models structure embedding spaces through clustering and kNN classification, which allows us to focus our comparison on feature extractors independent of their classifiers. We believe that this approach lets us evaluate the adaptability and generalization potential of models going beyond the classes they were trained on.
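As a rough illustration of the kNN part of such an evaluation, the sketch below scores a set of embeddings by leave-one-out kNN accuracy, so that no classifier head is involved. It is not the authors' implementation: the function names, the Euclidean distance, the value of k, and the synthetic 2-D "embeddings" are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def knn_leave_one_out_accuracy(embeddings, labels, k=3):
    """Classify each embedding by majority vote among its k nearest
    neighbours (Euclidean distance), excluding the embedding itself,
    and return the fraction classified correctly."""
    correct = 0
    for i, query in enumerate(embeddings):
        # Distances from the query to every other embedding.
        neighbours = [
            (math.dist(query, other), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        ]
        neighbours.sort(key=lambda pair: pair[0])
        votes = Counter(label for _, label in neighbours[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(embeddings)


def make_toy_embeddings(centres, n_per_class=20, spread=0.5, seed=0):
    """Hypothetical stand-in for a feature extractor: draws Gaussian
    points around one centre per class."""
    rng = random.Random(seed)
    embeddings, labels = [], []
    for label, centre in centres.items():
        for _ in range(n_per_class):
            embeddings.append([c + rng.gauss(0, spread) for c in centre])
            labels.append(label)
    return embeddings, labels


# An embedding space that separates the two classes well ...
separated, y_sep = make_toy_embeddings(
    {"species_a": [0.0, 0.0], "species_b": [10.0, 10.0]}
)
# ... and one in which both classes share the same region.
overlapping, y_ovl = make_toy_embeddings(
    {"species_a": [0.0, 0.0], "species_b": [0.0, 0.0]}
)

acc_separated = knn_leave_one_out_accuracy(separated, y_sep)
acc_overlapping = knn_leave_one_out_accuracy(overlapping, y_ovl)
print("separated:", acc_separated)
print("overlapping:", acc_overlapping)
```

Because the score depends only on how the embeddings are arranged, the same function could in principle be applied to embeddings from any model, including for classes the model was never trained on.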
