LG · ML · Oct 26, 2021

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

arXiv:2110.14002v1 · 8 citations
Originality: Incremental advance
AI Analysis

The paper addresses a key bottleneck in training discrete latent variable models and offers an incremental improvement over existing gradient estimators.

The paper tackles the challenge of backpropagating gradients through categorical variables by proposing CARMS, an unbiased estimator that uses multiple antithetic samples to reduce variance. It outperforms competing methods on generative modeling and structured output prediction tasks.

Accurately backpropagating the gradient through categorical variables is a challenging task that arises in various domains, such as training discrete latent variable models. To this end, we propose CARMS, an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples. CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce variance, while keeping the estimator unbiased using importance sampling. It generalizes both the ARMS antithetic estimator for binary variables, which is CARMS for two categories, and LOORF/VarGrad, the leave-one-out REINFORCE estimator, which is CARMS with independent samples. We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find that it outperforms competing methods, including a strong self-control baseline. The code is publicly available.
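
The abstract describes LOORF/VarGrad as the special case of CARMS with independent samples. As a rough illustration of that special case only, the sketch below estimates the gradient of E[f(z)] with respect to the logits of a single categorical variable using the standard leave-one-out REINFORCE form. It is not the authors' code and does not implement the copula-based antithetic sampling or importance weighting of CARMS; the names `softmax`, `loorf_gradient`, and `exact_gradient` are hypothetical helpers chosen for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loorf_gradient(logits, f, n_samples, rng):
    """Leave-one-out REINFORCE estimate of d E[f(z)] / d logits.

    Uses n_samples >= 2 independent draws z_1..z_n ~ Categorical(softmax(logits)).
    Each sample's baseline is the mean of f over the other samples, which
    simplifies to the (f_i - mean f) / (n - 1) weight used below.
    """
    probs = softmax(logits)
    categories = range(len(logits))
    # Independent sampling (assumption of this sketch); CARMS would draw
    # jointly antithetic samples at this step instead.
    samples = rng.choices(categories, weights=probs, k=n_samples)
    values = [f(z) for z in samples]
    mean_value = sum(values) / n_samples
    grad = [0.0] * len(logits)
    for z, value in zip(samples, values):
        weight = (value - mean_value) / (n_samples - 1)
        for k in categories:
            # Score function of a categorical: d log p(z) / d logit_k.
            score = (1.0 if k == z else 0.0) - probs[k]
            grad[k] += weight * score
    return grad


def exact_gradient(logits, f):
    """Closed-form gradient, available here because the support is tiny."""
    probs = softmax(logits)
    expected = sum(p * f(k) for k, p in enumerate(probs))
    return [p * (f(k) - expected) for k, p in enumerate(probs)]
```

With a small number of categories the estimate can be compared against `exact_gradient`; for example, `loorf_gradient([0.5, -1.0, 2.0], lambda z: float(z) ** 2, 100000, random.Random(0))` should land close to the closed-form value, with the gap shrinking as the number of samples grows.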

Code Implementations: 1 repo
