ITDSLGNov 18, 2019

Estimating Entropy of Distributions in Constant Space

arXiv:1911.07976v113 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of entropy estimation in memory-constrained streaming settings, which is incremental as it builds on prior work but introduces a constant-space method.

The paper tackles the problem of estimating the entropy of k-ary distributions from samples in the streaming model with limited space, achieving an algorithm that uses constant space and outputs a ±ε estimate with O(k log(1/ε)²/ε³) samples.

We consider the task of estimating the entropy of $k$-ary distributions from samples in the streaming model, where space is limited. Our main contribution is an algorithm that requires $O\left(\frac{k \log (1/\varepsilon)^2}{\varepsilon^3}\right)$ samples and a constant $O(1)$ memory words of space and outputs a $\pm\varepsilon$ estimate of $H(p)$. Without space limitations, the sample complexity has been established as $S(k,\varepsilon)=Θ\left(\frac k{\varepsilon\log k}+\frac{\log^2 k}{\varepsilon^2}\right)$, which is sub-linear in the domain size $k$, and the current algorithms that achieve optimal sample complexity also require nearly-linear space in $k$. Our algorithm partitions $[0,1]$ into intervals and estimates the entropy contribution of probability values in each interval. The intervals are designed to trade off the bias and variance of these estimates.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes