Learning Stochastic Recurrent Networks
This work addresses the need for flexible, probabilistic models of sequential data in applications such as music and motion analysis. The contribution is incremental in that it builds on existing variational inference techniques.
The authors augment recurrent neural networks with latent variables to obtain Stochastic Recurrent Networks (STORNs). The resulting model allows structured and multi-modal conditionals at each time step and reliable estimation of the marginal likelihood, and it achieves competitive performance on polyphonic music and motion capture datasets.
Leveraging advances in variational inference, we propose to enhance recurrent neural networks with latent variables, resulting in Stochastic Recurrent Networks (STORNs). The model i) can be trained with stochastic gradient methods, ii) allows structured and multi-modal conditionals at each time step, iii) features a reliable estimator of the marginal likelihood and iv) is a generalisation of deterministic recurrent neural networks. We evaluate the method on four polyphonic musical data sets and motion capture data.
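To make the idea of a recurrent network with latent variables more concrete, the following is a minimal NumPy sketch of a single-sample estimate of a variational lower bound on the log-likelihood of one sequence. It is an illustration under stated assumptions, not the authors' implementation: the standard normal prior on each latent, the Bernoulli output model (a natural fit for binary piano-roll data), the tanh recurrence, and all names (`storn_step`, `elbo_estimate`, `q_mu`, `q_log_var`, the weight matrices) are hypothetical. The arrays `q_mu` and `q_log_var` stand in for the per-step outputs of a recognition model, which is not shown.

```python
import numpy as np

def storn_step(x_prev, z_t, h_prev, params):
    """One generative step: the hidden state sees the previous input,
    the current latent sample, and the previous hidden state."""
    W_in, W_z, W_rec, b_h, W_out, b_out = params
    h_t = np.tanh(W_in @ x_prev + W_z @ z_t + W_rec @ h_prev + b_h)
    logits = W_out @ h_t + b_out  # Bernoulli logits for x_t (assumed output model)
    return h_t, logits

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def bernoulli_log_lik(x, logits):
    """log p(x | logits) for independent Bernoulli outputs.
    Uses x * l - softplus(l), with softplus computed stably via logaddexp."""
    return np.sum(x * logits - np.logaddexp(0.0, logits))

def elbo_estimate(x_seq, q_mu, q_log_var, params, rng):
    """Single-sample lower-bound estimate for one sequence of shape (T, x_dim)."""
    T, x_dim = x_seq.shape
    W_rec = params[2]
    h = np.zeros(W_rec.shape[0])
    x_prev = np.zeros(x_dim)  # assumed all-zero input before the first step
    total = 0.0
    for t in range(T):
        # Reparameterised sample: z_t = mu + sigma * eps, with eps ~ N(0, I).
        eps = rng.standard_normal(q_mu.shape[1])
        z_t = q_mu[t] + np.exp(0.5 * q_log_var[t]) * eps
        h, logits = storn_step(x_prev, z_t, h, params)
        total += bernoulli_log_lik(x_seq[t], logits)
        total -= gaussian_kl_to_standard_normal(q_mu[t], q_log_var[t])
        x_prev = x_seq[t]
    return total
```

In this sketch, setting the hypothetical latent-to-hidden weights `W_z` to zero removes every dependence on the latent samples, so the recurrence reduces to an ordinary deterministic recurrent network. This mirrors property iv) above. Because the sample is written as a deterministic function of `eps`, the same computation expressed in an automatic differentiation framework could be optimised with stochastic gradient methods.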