LG Jan 23, 2023

Explaining Deep Learning Hidden Neuron Activations using Concept Induction

arXiv:2301.09611v12 citations h-index: 58
Originality: Incremental advance
AI Analysis

This work addresses explainable AI for researchers and practitioners by offering a systematic approach to reducing the black-box character of deep learning systems. The advance is incremental, since it builds on existing concept induction techniques.

The paper tackles the challenge of interpreting hidden neuron activations in deep learning. It introduces an automated method that uses concept induction and large-scale background knowledge to assign labels to neurons, and it reports that the resulting interpretations are meaningful.

One of the key current challenges in Explainable AI is correctly interpreting the activations of hidden neurons. Accurate interpretations would provide insight into what a deep learning system has internally detected as relevant in the input, thus lifting some of the black-box character of deep learning systems. The state of the art indicates that hidden node activations can, at least in some cases, be interpreted in a way that makes sense to humans. Yet systematic, automated methods that first hypothesize an interpretation of hidden neuron activations and then verify it are mostly missing. In this paper, we provide such a method and demonstrate that it yields meaningful interpretations. It combines large-scale background knowledge (a class hierarchy of approx. 2 million classes curated from the Wikipedia Concept Hierarchy) with a symbolic reasoning approach called concept induction, which is based on description logics and was originally developed for applications in the Semantic Web field. Our results show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer of a Convolutional Neural Network through a hypothesis and verification process.
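
The hypothesis and verification process described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the description-logic concept induction over the Wikipedia-derived class hierarchy is replaced here by a simple coverage score over flat concept tags, and every name, threshold, and data value below is a hypothetical stand-in chosen for illustration.

```python
from typing import Dict, Set

Activations = Dict[str, float]     # image id -> activation of one neuron (assumed)
Annotations = Dict[str, Set[str]]  # image id -> concept tags for the image (assumed)


def hypothesize_label(acts: Activations, annots: Annotations, threshold: float) -> str:
    """Propose the concept that best separates activating from non-activating images.

    Stand-in for concept induction: score = coverage of positives minus
    coverage of negatives, computed over flat tags (no class hierarchy).
    """
    positives = {i for i, a in acts.items() if a > threshold}
    negatives = set(acts) - positives
    concepts = sorted(set().union(*annots.values()))

    def score(concept: str) -> float:
        pos_cov = sum(concept in annots[i] for i in positives) / max(len(positives), 1)
        neg_cov = sum(concept in annots[i] for i in negatives) / max(len(negatives), 1)
        return pos_cov - neg_cov

    return max(concepts, key=score)


def verify_label(label: str, acts: Activations, annots: Annotations, threshold: float) -> float:
    """Return the fraction of held-out images tagged with `label` that activate the neuron."""
    targets = [i for i, tags in annots.items() if label in tags]
    if not targets:
        return 0.0
    return sum(acts[i] > threshold for i in targets) / len(targets)


# Hypothetical data: activations of a single dense-layer neuron.
train_acts = {"img1": 0.9, "img2": 0.8, "img3": 0.1, "img4": 0.0}
train_annots = {
    "img1": {"building", "sky"},
    "img2": {"building", "road"},
    "img3": {"sky", "tree"},
    "img4": {"road", "tree"},
}
held_acts = {"img5": 0.7, "img6": 0.6, "img7": 0.2}
held_annots = {"img5": {"building"}, "img6": {"building", "tree"}, "img7": {"tree"}}

label = hypothesize_label(train_acts, train_annots, threshold=0.5)
rate = verify_label(label, held_acts, held_annots, threshold=0.5)
print(label, rate)
```

The two steps are kept separate so that the label is proposed on one set of images and checked on another. A label would only be accepted if the held-out activation rate clears some chosen cutoff, which is left to the caller here.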

