Neural Variational Inference for Text Processing
This work introduces a neural variational inference framework for generative and conditional models of text and applies it to two tasks: document modeling and question answering.
The paper applies neural variational inference to text processing through a generic framework in which an inference network, conditioned on the discrete text input, provides the variational distribution. The resulting models achieve the lowest reported perplexities on two standard document modeling corpora and outperform previously published results on two question answering benchmarks.
Recent advances in neural variational inference have spawned a renaissance in deep latent variable models. In this paper we introduce a generic variational inference framework for generative and conditional models of text. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. We validate this framework on two very different text modelling applications: generative document modelling and supervised question answering. Our neural variational document model combines a continuous stochastic document representation with a bag-of-words generative model and achieves the lowest reported perplexities on two standard test corpora. The neural answer selection model employs a stochastic representation layer within an attention mechanism to extract the semantics between a question and answer pair. On two question answering benchmarks this model exceeds all previously published results.
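To make the document model concrete, here is a minimal sketch of a single-sample variational lower bound for one document. It uses only the Python standard library. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the one-layer tanh encoder, the diagonal Gaussian variational distribution, the standard normal prior, the softmax decoder over the vocabulary, the layer sizes, and all function and parameter names (`init_params`, `elbo_estimate`, and so on).

```python
import math
import random


def matvec(x, W, b):
    """Return x @ W + b for a vector x, a row-major matrix W, and a bias b."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j]
            for j in range(len(b))]


def init_layer(n_in, n_out, rng, scale=0.01):
    """Small random weights and zero biases for one dense layer."""
    W = [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]
    return W, [0.0] * n_out


def init_params(vocab_size, hidden_dim, latent_dim, rng):
    """Hypothetical parameter layout: encoder, two Gaussian heads, decoder."""
    return {
        "enc": init_layer(vocab_size, hidden_dim, rng),
        "mu": init_layer(hidden_dim, latent_dim, rng),
        "logvar": init_layer(hidden_dim, latent_dim, rng),
        "dec": init_layer(latent_dim, vocab_size, rng),
    }


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def elbo_estimate(params, bow, rng):
    """Single-sample lower bound on log p(document) for a word-count vector."""
    # Inference network q(h | x): assumed diagonal Gaussian whose mean and
    # log-variance are computed from the discrete bag-of-words input.
    hidden = [math.tanh(v) for v in matvec(bow, *params["enc"])]
    mu = matvec(hidden, *params["mu"])
    logvar = matvec(hidden, *params["logvar"])

    # Reparameterised sample of the continuous document representation h.
    h = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mu, logvar)]

    # Bag-of-words generative model p(x | h): one softmax over the vocabulary,
    # with every word in the document drawn independently from it.
    log_probs = log_softmax(matvec(h, *params["dec"]))
    recon = sum(count * lp for count, lp in zip(bow, log_probs))

    # Closed-form KL(q(h | x) || N(0, I)) for a diagonal Gaussian.
    kl = 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                   for m, lv in zip(mu, logvar))
    return recon - kl, recon, kl


rng = random.Random(0)
params = init_params(vocab_size=6, hidden_dim=4, latent_dim=2, rng=rng)
bow = [2, 0, 1, 0, 0, 3]  # toy word counts for a six-word vocabulary
bound, recon, kl = elbo_estimate(params, bow, rng)
# Per-word perplexity estimate derived from the bound (noisy with one sample).
perplexity_estimate = math.exp(-bound / sum(bow))
```

In a real system the parameters would be trained by maximising this bound with a gradient-based optimiser and automatic differentiation. The sketch only evaluates the bound once, for untrained weights, to show how the inference network and the generative model fit together.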