Low-Rank Hidden State Embeddings for Viterbi Sequence Labeling
This paper addresses the limitations of hand-designed graphical models for representing output structure in NLP sequence labeling, offering an incremental accuracy improvement together with interpretable latent structure.
The paper tackles the problem of representing long-term output dependencies in sequence labeling by learning embedded representations of latent output structure, achieving accuracy improvements on a synthetic task based on CoNLL named entity recognition.
In textual information extraction and other sequence labeling tasks it is now common to use recurrent neural networks (such as LSTMs) to form rich embedded representations of long-term input co-occurrence patterns. Representation of output co-occurrence patterns, by contrast, is typically limited to a hand-designed graphical model, such as a linear-chain CRF representing short-term Markov dependencies among successive labels. This paper presents a method that learns embedded representations of latent output structure in sequence data. Our model takes the form of a finite-state machine with a large number of latent states per label (a latent variable CRF), where the state-transition matrix is factorized, effectively forming an embedded representation of state transitions capable of enforcing long-term label dependencies, while supporting exact Viterbi inference over output labels. We demonstrate accuracy improvements and interpretable latent structure in a synthetic but complex task based on CoNLL named entity recognition.
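The idea of a factorized state-transition matrix combined with Viterbi decoding can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each label owns a fixed, equal number of latent states, that transition scores are the product of two thin matrices `U` and `V` (so the S x S transition matrix has rank at most r), and that every latent state simply inherits the local score of its label. All function and variable names here are hypothetical, and the random scores stand in for what an input encoder such as an LSTM would produce.

```python
import random


def low_rank_transitions(U, V):
    """Build the S x S transition score matrix A = U V^T.

    U and V each have shape S x r, so A is described by 2*S*r numbers
    instead of S*S, and its rank is at most r.
    """
    return [[sum(u * v for u, v in zip(u_row, v_row)) for v_row in V]
            for u_row in U]


def viterbi(local, trans):
    """Return the highest-scoring state path and its score.

    local: T x S list of per-position state scores.
    trans: S x S list of transition scores, trans[i][j] for i -> j.
    """
    num_states = len(trans)
    score = list(local[0])
    backpointers = []
    for t in range(1, len(local)):
        new_score, pointers = [], []
        for j in range(num_states):
            # Best predecessor state for arriving in state j at position t.
            best_i = max(range(num_states),
                         key=lambda i: score[i] + trans[i][j])
            new_score.append(score[best_i] + trans[best_i][j] + local[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    last = max(range(num_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


def decode_labels(label_scores, U, V, states_per_label):
    """Decode a label sequence through the latent state space.

    Assumption: latent state s belongs to label s // states_per_label and
    inherits that label's local score at every position.
    """
    num_states = len(U)
    trans = low_rank_transitions(U, V)
    local = [[row[s // states_per_label] for s in range(num_states)]
             for row in label_scores]
    states, best = viterbi(local, trans)
    labels = [s // states_per_label for s in states]
    return labels, states, best


if __name__ == "__main__":
    rng = random.Random(0)
    num_labels, states_per_label, rank, length = 3, 4, 2, 6
    num_states = num_labels * states_per_label
    U = [[rng.gauss(0, 1) for _ in range(rank)] for _ in range(num_states)]
    V = [[rng.gauss(0, 1) for _ in range(rank)] for _ in range(num_states)]
    label_scores = [[rng.gauss(0, 1) for _ in range(num_labels)]
                    for _ in range(length)]
    labels, states, best = decode_labels(label_scores, U, V, states_per_label)
    print("labels:", labels)
    print("latent states:", states)
```

Because several latent states share one label, the decoded state path can carry information about earlier parts of the sequence that the label alone would not, which is the mechanism by which such a model could enforce longer-range label dependencies. The sketch returns the labels of the single best latent state path; how the paper's exact inference over output labels treats the latent states is not specified in the abstract.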