LG AI CV · Nov 11, 2021

Defining and Quantifying the Emergence of Sparse Concepts in DNNs

Jie Ren, Mingjie Li, Qirui Chen, Huiqi Deng, Quanshi Zhang

arXiv:2111.06206v616.848 citations · Has Code

Originality Incremental advance

AI Analysis

This paper gives researchers and practitioners a method for interpreting DNNs, but the advance is incremental because it builds on existing concept-based explanation approaches.

The paper tackles the problem of explaining deep neural networks (DNNs). It shows that a DNN's inference score can be disentangled into the effects of a few interactive concepts, which are represented as a sparse causal graph, and it proves that this graph can mimic the DNN's outputs on an exponential number of masked samples.

This paper aims to illustrate the concept-emerging phenomenon in a trained DNN. Specifically, we find that the inference score of a DNN can be disentangled into the effects of a few interactive concepts. These concepts can be understood as causal patterns in a sparse, symbolic causal graph, which explains the DNN. The faithfulness of using such a causal graph to explain the DNN is theoretically guaranteed, because we prove that the causal graph can well mimic the DNN's outputs on an exponential number of different masked samples. Besides, such a causal graph can be further simplified and re-written as an And-Or graph (AOG), without losing much explanation accuracy.
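
The decomposition described in the abstract can be illustrated with a small sketch. The abstract does not state the formula for a concept's effect, so this example assumes the effect of a concept S is the Harsanyi dividend, I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) · v(T), where v(T) is the model's score when only the input variables in T are left unmasked. The function `toy_model` and the helper names are hypothetical stand-ins for a real DNN and are not taken from the authors' code.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset (2**n subsets in total)."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_effects(v, n):
    """Assumed effect definition: I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(range(n))
    }


def toy_model(kept):
    """Hypothetical stand-in for a DNN score when only `kept` variables are unmasked."""
    score = 0.5                      # output on the fully masked sample
    if 0 in kept:
        score += 1.0                 # single-variable concept {0}
    if {1, 2} <= kept:
        score += 2.0                 # AND-style concept {1, 2}
    if {0, 1, 3} <= kept:
        score -= 1.5                 # AND-style concept {0, 1, 3}
    return score


n = 4
effects = harsanyi_effects(toy_model, n)

# Sparsity: only a few of the 2**n candidate concepts have a non-zero effect.
salient = {tuple(sorted(S)): e for S, e in effects.items() if abs(e) > 1e-9}

# Faithfulness: summing the effects of all concepts contained in S
# reproduces the model score on every one of the 2**n masked samples.
max_error = max(
    abs(toy_model(S) - sum(effects[T] for T in subsets(S)))
    for S in subsets(range(n))
)

print(salient)
print(max_error)
```

Under this assumption, the sketch enumerates all 2^n masked samples, so it is only practical for a handful of input variables. It illustrates the two properties the abstract claims, sparsity of the concepts and exact reconstruction of the masked outputs, on a toy function and does not demonstrate them for an actual trained network.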
