ML LGSep 25, 2017

Predictive-State Decoders: Encoding the Future into Recurrent Networks

Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Lerrel Pinto, Martial Hebert, Byron Boots, Kris M. Kitani, J. Andrew Bagnell

arXiv:1709.08520v114.246 citations

Originality Incremental advance

AI Analysis

This incremental improvement addresses the problem of enhancing RNN efficiency and accuracy for researchers and practitioners in dynamic process modeling.

The paper tackles the challenge of modeling latent states in recurrent neural networks (RNNs) by augmenting them with Predictive-State Decoders (PSDs), which supervise the internal state to predict future observations, resulting in improved statistical performance with fewer iterations and less data across domains like probabilistic filtering, imitation learning, and reinforcement learning.

Recurrent neural networks (RNNs) are a vital modeling technique that rely on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding its analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations. Predictive-State Decoders are simple to implement and easily incorporated into existing training pipelines via additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, Imitation Learning, and Reinforcement Learning. In each, our method improves statistical performance of state-of-the-art recurrent baselines and does so with fewer iterations and less data.

View on arXiv PDF

Similar