LG AI CV · Nov 24, 2020

The Interpretable Dictionary in Sparse Coding

Edward Kim, Connor Onweller, Andrew O'Brien, Kathleen McCoy

arXiv:2011.11805v1 · 2.31 citations

Originality: Incremental advance

AI Analysis

This work tackles the problem of improving the interpretability of neural networks, which is a significant challenge for researchers and practitioners who need to understand the internal workings of AI models.

This paper addresses the interpretability of Artificial Neural Networks (ANNs) by demonstrating that an ANN trained with sparse coding, under specific sparsity constraints, results in a more interpretable model than standard deep learning. The dictionary learned by sparse coding is more easily understood, and its activations create selective feature outputs, showing both qualitative and quantitative benefits in interpretation compared to an equivalent convolutional autoencoder.

Artificial neural networks (ANNs), specifically deep learning networks, have often been labeled as black boxes because the internal representation of the data is not easily interpretable. In our work, we illustrate that an ANN trained using sparse coding under specific sparsity constraints yields a more interpretable model than the standard deep learning model. The dictionary learned by sparse coding can be more easily understood, and the activations of these elements create a selective feature output. We compare and contrast our sparse coding model with an equivalent feedforward convolutional autoencoder trained on the same data. Our results show both qualitative and quantitative benefits in the interpretation of the learned sparse coding dictionary as well as the internal activation representations.
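To make "sparse coding under a sparsity constraint" concrete, here is a minimal sketch of the inference step: given a dictionary D and an input x, find activations a that reconstruct x while an L1 penalty keeps most of them at zero. Everything in it is an assumption for illustration rather than the authors' setup. The solver is ISTA (iterative soft-thresholding), a standard generic choice that the abstract does not name; the dictionary is random rather than learned; and the names `ista_sparse_code`, `objective`, and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within [-t, t] become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(x, D, a, lam):
    """Sparse coding objective: 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    residual = x - D @ a
    return 0.5 * residual @ residual + lam * np.abs(a).sum()

def ista_sparse_code(x, D, lam=0.1, n_iter=500):
    """Approximately minimize the objective over a using ISTA (assumed solver)."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                   # gradient of the reconstruction term
        a = soft_threshold(a - grad / L, lam / L)  # proximal step for the L1 penalty
    return a

# Illustrative data: a random overcomplete dictionary with unit-norm atoms
# (not a learned dictionary) and a signal built from three of its atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0, keepdims=True)
a_true = np.zeros(256)
a_true[[3, 50, 200]] = [1.5, -2.0, 1.0]
x = D @ a_true

lam = 0.1
a = ista_sparse_code(x, D, lam=lam)
print(np.count_nonzero(a), "of", a.size, "activations are nonzero")
```

Raising `lam` pushes more activations to exactly zero at the cost of reconstruction accuracy. That selectivity is what distinguishes this representation from the dense code a feedforward autoencoder would typically produce for the same input.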
