Sep 10, 2025

Explainability of CNN Based Classification Models for Acoustic Signal

arXiv:2509.08717v1 | 1 citation | h-index: 1 | ICTAI
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for trust and interpretability in bioacoustics, an underexplored domain, but is incremental as it applies existing XAI methods to a new dataset.

The authors tackled the problem of interpreting deep learning models in bioacoustics by applying multiple explainable AI (XAI) techniques to a CNN classifying bird vocalizations, achieving 94.8% accuracy and showing that combined explanations provide more complete insights.

Explainable Artificial Intelligence (XAI) has emerged as a critical tool for interpreting the predictions of complex deep learning models. While XAI has been increasingly applied in various domains within acoustics, its use in bioacoustics, which involves analyzing audio signals from living organisms, remains relatively underexplored. In this paper, we investigate the vocalizations of a bird species with strong geographic variation throughout its range in North America. Audio recordings were converted into spectrogram images and used to train a deep Convolutional Neural Network (CNN) for classification, achieving an accuracy of 94.8%. To interpret the model's predictions, we applied both model-agnostic (LIME, SHAP) and model-specific (DeepLIFT, Grad-CAM) XAI techniques. These techniques produced different but complementary explanations; considered together, they provided more complete and interpretable insights into the model's decision-making. This work highlights the importance of combining XAI techniques to improve trust and interpretability, not only in acoustic signal analysis but also in other domain-specific tasks.
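The abstract mentions converting audio recordings into spectrogram images before training the CNN, but does not give the parameters used. The following is a minimal sketch of that step, assuming NumPy is available. The function names, frame length, hop size, and the synthetic 3 kHz test tone are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, frame_len=512, hop=128):
    """Return (freqs_hz, times_s, log_power) for a 1-D signal.

    frame_len and hop are illustrative defaults, not the paper's settings.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Power spectrum of each frame (one-sided FFT for real input).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Convert to decibels; transpose so rows are frequency bins, columns are frames.
    log_power = 10.0 * np.log10(power + 1e-10).T
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, log_power


def to_uint8_image(log_power):
    """Rescale a log-power spectrogram to an 8-bit grayscale image."""
    lo, hi = log_power.min(), log_power.max()
    scaled = (log_power - lo) / (hi - lo + 1e-12)
    return np.round(255 * scaled).astype(np.uint8)


# Synthetic stand-in for a recording: one second of a 3 kHz tone.
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 3000 * t)

freqs, times, spec = log_spectrogram(tone, sample_rate)
image = to_uint8_image(spec)
peak_hz = freqs[spec.mean(axis=1).argmax()]
```

Here `image` is a 2-D array (frequency bins by time frames) that could be fed to an image classifier, and `peak_hz` should land within one frequency bin of the 3 kHz test tone. A real pipeline would more likely use a mel-scaled spectrogram from an audio library, which is a separate design choice the abstract does not settle.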
