ML · LG · AP · CO · ME · May 14, 2019

Convolutional Poisson Gamma Belief Network

arXiv:1905.05394v1 · 15 citations
Originality: Incremental advance
AI Analysis

The paper addresses the limitation of lossy text representations in machine learning, offering a domain-specific improvement for natural language processing tasks.

The paper proposes a text-analysis model that processes the words of each document directly as a sequence of high-dimensional one-hot vectors, so that word order is preserved. The resulting high-quality latent representations can enrich existing models that ignore word order.

For text analysis, one often resorts to a lossy representation that either completely ignores word order or embeds each word as a low-dimensional dense feature vector. In this paper, we propose convolutional Poisson factor analysis (CPFA) that directly operates on a lossless representation that processes the words in each document as a sequence of high-dimensional one-hot vectors. To boost its performance, we further propose the convolutional Poisson gamma belief network (CPGBN) that couples CPFA with the gamma belief network via a novel probabilistic pooling layer. CPFA forms words into phrases and captures very specific phrase-level topics, and CPGBN further builds a hierarchy of increasingly more general phrase-level topics. For efficient inference, we develop both a Gibbs sampler and a Weibull distribution based convolutional variational auto-encoder. Experimental results demonstrate that CPGBN can extract high-quality text latent representations that capture the word order information, and hence can be leveraged as a building block to enrich a wide variety of existing latent variable models that ignore word order.
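
To make the "lossless representation" in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It encodes a toy document as a vocabulary-by-length one-hot matrix and builds a nonnegative Poisson rate of the same shape by convolving each filter along the word-position axis. The function names, the toy vocabulary, the filter width, the gamma draws, and the filter normalization are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def one_hot_document(token_ids, vocab_size):
    """Encode a document as a (vocab_size, doc_len) binary matrix.

    Column t has a single 1 in the row of the t-th word, so word order is kept.
    """
    doc = np.zeros((vocab_size, len(token_ids)), dtype=np.int64)
    doc[token_ids, np.arange(len(token_ids))] = 1
    return doc


def convolutional_rate(filters, weights):
    """Sum of per-filter full convolutions along the position axis.

    filters: (K, V, F) nonnegative array, one V x F "phrase" filter per topic.
    weights: (K, S) nonnegative array of per-position activations.
    Returns a (V, S + F - 1) nonnegative rate matrix.
    """
    num_filters, vocab_size, width = filters.shape
    num_positions = weights.shape[1]
    rate = np.zeros((vocab_size, num_positions + width - 1))
    for k in range(num_filters):
        for v in range(vocab_size):
            rate[v] += np.convolve(filters[k, v], weights[k], mode="full")
    return rate


# Toy data (hypothetical vocabulary and document).
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
token_ids = [0, 1, 2, 3, 0, 4]  # "the cat sat on the mat"
X = one_hot_document(token_ids, len(vocab))

K, F = 2, 3  # assumed: 2 filters, each spanning 3 consecutive words
filters = rng.gamma(1.0, 1.0, size=(K, len(vocab), F))
filters /= filters.sum(axis=(1, 2), keepdims=True)  # assumed normalization
weights = rng.gamma(1.0, 1.0, size=(K, X.shape[1] - F + 1))

rate = convolutional_rate(filters, weights)  # same shape as X
```

Because each filter spans several consecutive word positions, a single filter can place mass on an ordered word pattern, which is one way to read the abstract's statement that CPFA "forms words into phrases". A bag-of-words count vector, by contrast, would be the row sums of `X` and would discard the column order.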

Code Implementations: 1 repo
