LG · ML · Aug 5, 2020

FRMDN: Flow-based Recurrent Mixture Density Network

arXiv:2008.02144v3
AI Analysis

This work addresses sequence modeling for applications such as image and speech processing. It is incremental in that it builds on existing recurrent mixture density networks.

The authors improve probabilistic sequence modeling by generalizing recurrent mixture density networks with normalizing flows, and report significant log-likelihood gains on image and speech data over other state-of-the-art methods.

Recurrent mixture density networks are an important class of probabilistic models used extensively in sequence modeling and sequence-to-sequence mapping applications. In this class of models, the density of a target sequence at each time-step is modeled by a Gaussian mixture model whose parameters are given by a recurrent neural network. In this paper, we generalize recurrent mixture density networks by defining a Gaussian mixture model on a non-linearly transformed target sequence at each time-step. The non-linearly transformed space is created by a normalizing flow. We observed that this model significantly improves the fit to image sequences, as measured by the log-likelihood. We also applied the proposed model to some speech and image data, and observed that the model has significant modeling power, outperforming other state-of-the-art methods in terms of the log-likelihood.
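To make the construction in the abstract concrete, here is a minimal sketch of the per-time-step log-likelihood: the target is passed through an invertible map, a Gaussian mixture is evaluated in the transformed space, and the log-determinant of the map's Jacobian is added by the change-of-variables rule. Several things here are assumptions and not taken from the paper: the flow is a simple elementwise affine map chosen only for illustration, the mixture has diagonal covariances, the function names (`affine_flow`, `gmm_log_density`, `step_log_likelihood`) are hypothetical, and the mixture parameters are passed in as plain arrays where a recurrent network would normally produce them.

```python
import numpy as np

def affine_flow(y, log_scale, shift):
    """Illustrative invertible map z = y * exp(log_scale) + shift (an assumption,
    not the paper's flow). Returns z and log|det dz/dy|."""
    z = y * np.exp(log_scale) + shift
    log_det = np.sum(log_scale)  # Jacobian is diagonal with entries exp(log_scale)
    return z, log_det

def gmm_log_density(z, log_weights, means, log_stds):
    """Log-density of a diagonal-covariance Gaussian mixture at a point z.
    Shapes: z (D,), log_weights (K,), means (K, D), log_stds (K, D).
    log_weights are assumed to be normalized (log of weights summing to 1)."""
    d = z.shape[-1]
    sq = ((z - means) / np.exp(log_stds)) ** 2
    comp = (-0.5 * np.sum(sq, axis=-1)
            - np.sum(log_stds, axis=-1)
            - 0.5 * d * np.log(2.0 * np.pi))
    a = log_weights + comp
    m = np.max(a)  # log-sum-exp with max subtraction for numerical stability
    return m + np.log(np.sum(np.exp(a - m)))

def step_log_likelihood(y, log_weights, means, log_stds, log_scale, shift):
    """log p(y) = log GMM(f(y)) + log|det df/dy| for one time-step."""
    z, log_det = affine_flow(y, log_scale, shift)
    return gmm_log_density(z, log_weights, means, log_stds) + log_det
```

With an identity flow (`log_scale` and `shift` all zero) this reduces to an ordinary mixture density output, which is the sense in which the flow-based model generalizes the standard one. In a full model, the per-step terms would be summed over the sequence to give the sequence log-likelihood.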

