DS IT LGMay 19, 2022

Estimation of Entropy in Constant Space with Improved Sample Complexity

Maryam Aliakbarpour, Andrew McGregor, Jelani Nelson, Erik Waingarten

arXiv:2205.09804v14.37 citationsh-index: 40

Originality Incremental advance

AI Analysis

This incremental improvement addresses memory-efficient streaming algorithms for entropy estimation, relevant for data analysis and machine learning.

The paper tackles the problem of estimating entropy of a distribution with constant memory, reducing sample complexity from (k/ε^3)·polylog(1/ε) to (k/ε^2)·polylog(1/ε).

Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pmε$ additive error by streaming over $(k/ε^3) \cdot \text{polylog}(1/ε)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/ε^2)\cdot \text{polylog}(1/ε)$. We conjecture that this is optimal up to $\text{polylog}(1/ε)$ factors.

View on arXiv PDF

Similar