Efficient Rejection Sampling in the Entropy-Optimal Range

arXiv:2504.0426711.64 citationsh-index: 3
Predicted impact top 21% in DS · last 90 daysOriginality Incremental advance
AI Analysis

For practitioners needing random variate generation from discrete distributions, this algorithm offers a practical solution that simultaneously optimizes space, time, and entropy, which was previously unattainable.

The paper introduces a new sampling algorithm for discrete distributions that achieves entropy-optimal expected coin flips (between H(P) and H(P)+2) while using linearithmic space and negligible runtime per sample, overcoming the space-time tradeoff of previous methods.

We study the problem of generating a random variate $X$ from a finite discrete probability distribution $P$ using an entropy source of independent fair coin flips. A classic result from Knuth and Yao shows that the optimal expected number of input coin flips per output sample lies between $H(P)$ and $H(P)\,{+}\,2$, where $H$ is the Shannon entropy function. However, implementing the Knuth and Yao ``entropy-optimal'' sampler entails a tradeoff between using either exponential space with low runtime per sample, or linear space with high runtime per sample. We introduce a new sampling algorithm that avoids this tradeoff: it requires linearithmic space, incurs negligible runtime overhead per sample, and uses an expected number of coin flips that lies in the entropy-optimal range $[H(P), H(P)\,{+}\,2)$. No previous sampler for discrete distributions simultaneously achieves these space, time, and entropy characteristics. Numerical experiments demonstrate improvements in runtime and entropy of the proposed method compared to the celebrated alias method.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes