ML, CL, LG | Jun 16, 2016

Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models

arXiv:1606.05320v2 | 71 citations
Originality: Incremental advance
AI Analysis

This work addresses interpretability for users of RNNs in domains such as speech recognition and translation, but the advance is incremental because it builds on existing hybrid modeling approaches.

The paper aims to make recurrent neural networks (RNNs) more interpretable by combining them with hidden Markov models (HMMs), and finds that LSTMs and HMMs learn complementary information about text features.

As deep neural networks continue to revolutionize various application domains, there is increasing interest in making these powerful models more understandable and interpretable, and narrowing down the causes of good and bad predictions. We focus on recurrent neural networks (RNNs), state of the art models in speech recognition and translation. Our approach to increasing interpretability is by combining an RNN with a hidden Markov model (HMM), a simpler and more transparent model. We explore various combinations of RNNs and HMMs: an HMM trained on LSTM states; a hybrid model where an HMM is trained first, then a small LSTM is given HMM state distributions and trained to fill in gaps in the HMM's performance; and a jointly trained hybrid model. We find that the LSTM and HMM learn complementary information about the features in the text.
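To make the sequential hybrid concrete, the following is a minimal sketch of how HMM state distributions could be computed and packaged as extra features for a small LSTM. It is an illustration under stated assumptions, not the authors' implementation: it assumes a discrete-emission HMM whose parameters are already trained, uses forward-filtered posteriors p(z_t | x_1..x_t) as the "state distributions", and simply concatenates them with a one-hot encoding of the input. The abstract does not say which distribution is used or where it enters the LSTM. The names `hmm_filtered_states` and `hybrid_inputs` and all toy parameter values are hypothetical.

```python
def hmm_filtered_states(obs, pi, A, B):
    """Forward filtering: return p(z_t | x_1..x_t) for every step t.

    obs: list of integer symbol ids
    pi:  initial state distribution, length K
    A:   K x K transitions, A[i][j] = p(z_t = j | z_{t-1} = i)
    B:   K x V emissions,   B[k][v] = p(x_t = v | z_t = k)
    """
    K = len(pi)
    filtered = []
    for t, x in enumerate(obs):
        if t == 0:
            prior = list(pi)
        else:
            prev = filtered[-1]
            # One-step prediction: push the previous posterior through A.
            prior = [sum(prev[i] * A[i][j] for i in range(K)) for j in range(K)]
        # Weight by the emission likelihood of the observed symbol, then normalize.
        unnorm = [prior[k] * B[k][x] for k in range(K)]
        total = sum(unnorm)
        filtered.append([u / total for u in unnorm])
    return filtered


def hybrid_inputs(obs, pi, A, B):
    """Assumed feature layout: one-hot symbol followed by the HMM state posterior."""
    V = len(B[0])
    states = hmm_filtered_states(obs, pi, A, B)
    return [
        [1.0 if v == x else 0.0 for v in range(V)] + s
        for x, s in zip(obs, states)
    ]


# Toy HMM with 2 states and 3 symbols (values are made up for illustration).
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 0, 2, 2, 1]

features = hybrid_inputs(obs, pi, A, B)
print(len(features), len(features[0]))  # 5 time steps, 3 + 2 features each
```

Each row of `features` could then be fed to an LSTM as its input at that time step. Because the HMM columns form a probability distribution over a small number of states, they stay directly inspectable even after the LSTM is trained on top of them.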

