LG · ML · Dec 18, 2019

Distributional Reinforcement Learning for Energy-Based Sequential Models

arXiv:1912.08517v1 · 23 citations
Originality: Incremental advance
AI Analysis

This work addresses a sampling bottleneck in energy-based sequential models, offering a more general solution than the earlier distillation technique, which applies only under limited conditions.

The paper tackles the problem of sampling from unnormalized energy-based sequential models. These models have high representational power but cannot be sampled from directly. The authors propose a distributional reinforcement learning approach that generalizes the previous distillation technique. The resulting method applies to any sequential EBM, and its effectiveness is demonstrated on GAM-based experiments.

Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based Model (EBM) over sequences is derived. This EBM has high representational power, but it is unnormalized and cannot be directly exploited for sampling. To address this issue, [Parshakova et al., CoNLL 2019] propose a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments.
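To make the "distributional" reading concrete, here is a minimal toy sketch, not the paper's implementation. It assumes the goal is an autoregressive policy π_θ that matches the normalized distribution p(x) ∝ P(x) by minimizing the cross-entropy from p to π_θ. Since the gradient E_{x~p}[∇ log π_θ(x)] cannot be sampled directly, it is estimated with self-normalized importance weights P(x)/q(x) under a proposal q. The energy function, the uniform proposal, the tabular policy, and all names and hyperparameters below are illustrative assumptions.

```python
import itertools
import math
import random

SEQ_LEN = 3  # toy setting (assumption): binary sequences of length 3


def unnormalized_score(x):
    """Hypothetical EBM score P(x) = exp(-E(x)) with a global first/last feature."""
    energy = -sum(x) + (1.0 if x[0] != x[-1] else 0.0)
    return math.exp(-energy)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def policy_prob(theta, x):
    """Probability of x under a tabular autoregressive policy.

    theta[prefix] is the logit of emitting token 1 after that prefix.
    """
    prob = 1.0
    for t in range(SEQ_LEN):
        p_one = sigmoid(theta[x[:t]])
        prob *= p_one if x[t] == 1 else 1.0 - p_one
    return prob


def train(steps=4000, batch_size=64, lr=0.2, seed=0):
    rng = random.Random(seed)
    theta = {
        prefix: 0.0
        for t in range(SEQ_LEN)
        for prefix in itertools.product((0, 1), repeat=t)
    }
    q_prob = 1.0 / 2 ** SEQ_LEN  # uniform proposal q (assumption)
    for _step in range(steps):
        batch = [
            tuple(rng.randint(0, 1) for _pos in range(SEQ_LEN))
            for _sample in range(batch_size)
        ]
        # Importance weights P(x)/q(x); normalizing within the batch removes
        # the dependence on the unknown partition function Z.
        weights = [unnormalized_score(x) / q_prob for x in batch]
        total = sum(weights)
        grad = dict.fromkeys(theta, 0.0)
        for x, w in zip(batch, weights):
            for t in range(SEQ_LEN):
                prefix = x[:t]
                # d/d(logit) of log pi(x) at this prefix is (token - p_one).
                grad[prefix] += (w / total) * (x[t] - sigmoid(theta[prefix]))
        for prefix in theta:
            theta[prefix] += lr * grad[prefix]  # ascent on E_p[log pi_theta(x)]
    return theta


def total_variation(theta):
    """Total variation distance between the normalized EBM and the policy."""
    seqs = list(itertools.product((0, 1), repeat=SEQ_LEN))
    z = sum(unnormalized_score(x) for x in seqs)
    return 0.5 * sum(
        abs(unnormalized_score(x) / z - policy_prob(theta, x)) for x in seqs
    )


if __name__ == "__main__":
    trained = train()
    print("TV distance after training:", total_variation(trained))
```

In this toy setting the exact normalized distribution can be enumerated, so `total_variation` can check how closely the policy matches it. A reward-maximizing policy gradient (the "optimization" perspective) would instead tend to concentrate mass on the single highest-scoring sequence.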

Code Implementations: 1 repo