SD AI CV AS | Feb 25, 2025

From Vision to Sound: Advancing Audio Anomaly Detection with Vision-Based Algorithms

Manuel Barusco, Francesco Borsatti, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto

arXiv:2502.18328v1 | 9.32 citations | h-index: 8

Originality: Incremental advance

AI Analysis

This paper addresses the problem of making audio anomaly detection more interpretable for industrial and environmental applications. The contribution is incremental because it adapts existing vision methods.

The paper tackles audio anomaly detection by adapting vision-based anomaly detection algorithms to localize anomalies in spectrograms, improving explainability with fine-grained temporal-frequency localization.

Recent advances in Visual Anomaly Detection (VAD) have introduced sophisticated algorithms leveraging embeddings generated by pre-trained feature extractors. Inspired by these developments, we investigate the adaptation of such algorithms to the audio domain to address the problem of Audio Anomaly Detection (AAD). Unlike most existing AAD methods, which primarily classify anomalous samples, our approach introduces fine-grained temporal-frequency localization of anomalies within the spectrogram, significantly improving explainability. This capability enables a more precise understanding of where and when anomalies occur, making the results more actionable for end users. We evaluate our approach on industrial and environmental benchmarks, demonstrating the effectiveness of VAD techniques in detecting anomalies in audio signals. Moreover, they improve explainability by enabling localized anomaly identification, making audio anomaly detection systems more interpretable and practical.
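
To illustrate what time-frequency localization means in practice, the following is a minimal sketch of one common family of vision-style detectors: a memory bank of patches from normal data, scored by nearest-neighbor distance. It is not the authors' method. It assumes toy synthetic spectrograms, and raw patch values stand in for the embeddings of a pre-trained feature extractor. All function names (make_spectrogram, patches, fit_memory_bank, anomaly_map) are hypothetical.

```python
import math
import random


def make_spectrogram(n_freq=8, n_time=16, seed=0, burst=None):
    """Toy magnitude spectrogram (freq x time): a steady tone in bin 2 plus noise.

    `burst=(freq_bin, time_frame)` injects a synthetic anomaly at that cell.
    """
    rng = random.Random(seed)
    spec = [[0.05 * rng.random() for _ in range(n_time)] for _ in range(n_freq)]
    for t in range(n_time):
        spec[2][t] += 1.0
    if burst is not None:
        f, t = burst
        spec[f][t] += 3.0
    return spec


def patches(spec, size=2):
    """Split into non-overlapping size x size patches: ((row, col), flat vector)."""
    out = []
    for r in range(0, len(spec) - size + 1, size):
        for c in range(0, len(spec[0]) - size + 1, size):
            vec = [spec[r + i][c + j] for i in range(size) for j in range(size)]
            out.append(((r // size, c // size), vec))
    return out


def fit_memory_bank(normal_specs, size=2):
    """Collect every patch vector seen in normal recordings."""
    return [vec for spec in normal_specs for _, vec in patches(spec, size)]


def anomaly_map(spec, bank, size=2):
    """Score each patch by its distance to the closest normal patch."""
    rows, cols = len(spec) // size, len(spec[0]) // size
    amap = [[0.0] * cols for _ in range(rows)]
    for (r, c), vec in patches(spec, size):
        amap[r][c] = min(math.dist(vec, ref) for ref in bank)
    return amap


# Fit on normal clips only, then score a clip with a burst at bin 5, frame 10.
bank = fit_memory_bank([make_spectrogram(seed=s) for s in range(5)])
amap = anomaly_map(make_spectrogram(seed=99, burst=(5, 10)), bank)

# Clip-level score (detection) and the patch where it peaks (localization).
clip_score = max(max(row) for row in amap)
peak = max(
    ((r, c) for r in range(len(amap)) for c in range(len(amap[0]))),
    key=lambda rc: amap[rc[0]][rc[1]],
)
print(peak, round(clip_score, 2))
```

With 2 x 2 patches, the peak index maps back to frequency bins 4-5 and time frames 10-11, which is the kind of "where and when" output the abstract describes. A real pipeline would replace the toy spectrogram with a log-mel or STFT representation and the raw patches with learned features.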
