SD LG AS · Feb 23, 2022

Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF

Jayneel Parekh, Sanjeel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc, Gaël Richard

arXiv:2202.11479v2 · 16.3 · 30 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the problem of making the decisions of audio networks interpretable for end-users. The advance is incremental, as it builds on existing NMF and interpretability techniques.

The paper tackles post-hoc interpretability for audio processing networks by developing an interpreter that produces listenable audio-based explanations using non-negative matrix factorization (NMF), demonstrating its applicability on popular benchmarks including a real-world multi-label classification task.

This paper tackles post-hoc interpretability for audio processing networks. Our goal is to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, a carefully regularized interpreter module is trained to take hidden layer representations of the targeted network as input and produce time activations of pre-learnt NMF components as intermediate outputs. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on popular benchmarks, including a real-world multi-label classification task.
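To make the NMF vocabulary in the abstract concrete, here is a minimal sketch of the underlying decomposition, not the authors' implementation. It factorizes a toy magnitude spectrogram `V` into spectral components `W` and time activations `H`, then keeps only the share of `V` explained by a chosen subset of components. Several things here are assumptions made for illustration: the function names `nmf` and `soft_mask_part`, the toy data, the multiplicative-update solver, and the ratio-style soft mask. In the paper, the activations are produced by a trained interpreter from the network's hidden layers and the components are pre-learnt; in this sketch both come directly from running NMF on the input.

```python
import numpy as np

def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (freq x time) as W @ H using
    multiplicative updates for the squared Euclidean error."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, n_components)) + eps  # spectral components
    H = rng.random((n_components, n_time)) + eps  # time activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def soft_mask_part(V, W, H, selected, eps=1e-9):
    """Keep the share of V attributed to the selected components
    (assumed ratio mask; hypothetical helper for illustration)."""
    full = W @ H + eps
    part = W[:, selected] @ H[selected, :]
    return V * (part / full)

# Toy "spectrogram": two spectral patterns active at different times.
rng = np.random.default_rng(1)
W_true = rng.random((64, 2))
H_true = np.zeros((2, 100))
H_true[0, :50] = 1.0
H_true[1, 50:] = 1.0
V = W_true @ H_true

W, H = nmf(V, n_components=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Pretend component 0 was judged relevant to the decision and enhance it.
enhanced = soft_mask_part(V, W, H, selected=[0])
```

To turn `enhanced` into listenable audio, one would typically pair it with the phase of the original signal and invert the short-time Fourier transform. That step is omitted here.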

