SD, LG, AS | Feb 23, 2022

Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF

arXiv:2202.11479v2 | 30 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of making AI decisions interpretable for end-users in audio applications. The advance is incremental, as it builds on existing NMF and interpretability techniques.

The paper tackles post-hoc interpretability for audio processing networks by developing an interpreter that produces listenable audio-based explanations using non-negative matrix factorization (NMF), demonstrating its applicability on popular benchmarks including a real-world multi-label classification task.

This paper tackles post-hoc interpretability for audio processing networks. Our goal is to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, a carefully regularized interpreter module is trained to take hidden layer representations of the targeted network as input and produce time activations of pre-learnt NMF components as intermediate outputs. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on popular benchmarks, including a real-world multi-label classification task.
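The abstract's idea of explaining a decision through NMF components can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a magnitude spectrogram `V`, learns a dictionary `W` and time activations `H` with standard multiplicative updates, and uses a hypothetical `relevance` vector as a stand-in for whatever per-component importance a trained interpreter would supply. In the paper the activations come from an interpreter fed with hidden-layer representations; here they are estimated directly by NMF, purely for illustration. All function and variable names are assumptions.

```python
import numpy as np

def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x time) as W @ H using
    multiplicative updates for the Euclidean objective."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, n_components)) + eps  # spectral components
    H = rng.random((n_components, n_time)) + eps  # time activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def interpretation_spectrogram(V, W, H, relevance, eps=1e-9):
    """Soft-mask V so that only the components weighted by `relevance`
    remain. `relevance` holds one weight in [0, 1] per component and is
    a hypothetical placeholder for an interpreter's output."""
    full = W @ H + eps                           # reconstruction from all components
    selected = W @ (relevance[:, None] * H)      # reconstruction from relevant ones
    return V * (selected / full)

# Toy non-negative "spectrogram" built from 4 synthetic components.
rng = np.random.default_rng(1)
V = rng.random((64, 4)) @ rng.random((4, 100))

W, H = nmf(V, n_components=4)
relevance = np.array([1.0, 0.0, 0.0, 1.0])  # assumed: components 0 and 3 matter
V_interp = interpretation_spectrogram(V, W, H, relevance)
V_none = interpretation_spectrogram(V, W, H, np.zeros(4))
```

To make such a masked spectrogram listenable, one would typically combine it with the phase of the original signal and invert the short-time Fourier transform; that step is omitted here.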

Code Implementations: 1 repo
