ML | LG | Sep 29, 2018

Improved Gradient-Based Optimization Over Discrete Distributions

arXiv:1810.00116v3 | 9 citations
Originality: Incremental advance
AI Analysis

This work addresses gradient estimation for objectives defined over discrete distributions, a problem relevant to researchers and practitioners working on discrete optimization. The contribution appears incremental, since it builds on existing estimators.

The paper tackles gradient estimation for optimizing expectations over discrete distributions. It shows that the Gumbel-Softmax estimator is biased and proposes methods to reduce that bias, which improved performance on variational inference and binary optimization tasks.

In many applications we seek to maximize an expectation with respect to a distribution over discrete variables. Estimating gradients of such objectives with respect to the distribution parameters is a challenging problem. We analyze existing solutions including finite-difference (FD) estimators and continuous relaxation (CR) estimators in terms of bias and variance. We show that the commonly used Gumbel-Softmax estimator is biased and propose a simple method to reduce it. We also derive a simpler piece-wise linear continuous relaxation that also possesses reduced bias. We demonstrate empirically that reduced bias leads to a better performance in variational inference and on binary optimization tasks.
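
To make the bias claim concrete, here is a minimal sketch for a single Bernoulli variable, using the binary case of the Gumbel-Softmax relaxation (logistic noise passed through a tempered sigmoid). The toy objective f(z) = (z - 0.45)^2, the values of theta and tau, and all function names are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not implement the paper's bias-reduction method.

```python
import numpy as np

def f(z):
    # Hypothetical toy objective, chosen only for illustration.
    return (z - 0.45) ** 2

def f_prime(z):
    # Derivative of the toy objective with respect to z.
    return 2.0 * (z - 0.45)

def exact_gradient():
    # For z ~ Bernoulli(theta): E[f(z)] = theta * f(1) + (1 - theta) * f(0),
    # so the exact gradient with respect to theta is f(1) - f(0).
    return f(1.0) - f(0.0)

def relaxed_gradient(theta, tau, n_samples=200_000, seed=0):
    # Monte Carlo average of the pathwise gradient through the relaxed sample
    # z_tau = sigmoid((logit(theta) + logit(u)) / tau), with u ~ Uniform(0, 1).
    rng = np.random.default_rng(seed)
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=n_samples)
    logit_theta = np.log(theta) - np.log1p(-theta)
    logistic_noise = np.log(u) - np.log1p(-u)
    z = 1.0 / (1.0 + np.exp(-(logit_theta + logistic_noise) / tau))
    # Chain rule: dz/dtheta = z * (1 - z) / (tau * theta * (1 - theta)).
    dz_dtheta = z * (1.0 - z) / (tau * theta * (1.0 - theta))
    return float(np.mean(f_prime(z) * dz_dtheta))

theta, tau = 0.5, 1.0  # assumed values for the demonstration
g_exact = exact_gradient()
g_relaxed = relaxed_gradient(theta, tau)
print(f"exact gradient:   {g_exact:.4f}")
print(f"relaxed gradient: {g_relaxed:.4f}")
```

With theta = 0.5 and tau = 1 the relaxed sample is simply uniform on (0, 1), so the relaxed gradient has the closed-form expectation 1/15 (about 0.067), while the exact gradient is 0.1. The gap does not shrink with more samples, which is what distinguishes bias from variance.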

