LGDSSTOct 24, 2022

Learning and Covering Sums of Independent Random Variables with Unbounded Support

arXiv:2210.13313v12 citationsh-index: 17
Originality Highly original
AI Analysis

This addresses theoretical challenges in probability and learning theory for handling heavy-tailed distributions, with incremental improvements over prior lower bounds.

The paper tackles the problem of learning sums of independent integer-valued random variables with unbounded support, showing that under certain conditions, sample complexity can be made independent of the number of variables and support size, achieving poly(1/ε) complexity, and it provides algorithms for proper sparse covers in total variation distance with Õ(k)·poly(1/ε) samples for exponential families.

We study the problem of covering and learning sums $X = X_1 + \cdots + X_n$ of independent integer-valued random variables $X_i$ (SIIRVs) with unbounded, or even infinite, support. De et al. at FOCS 2018, showed that the maximum value of the collective support of $X_i$'s necessarily appears in the sample complexity of learning $X$. In this work, we address two questions: (i) Are there general families of SIIRVs with unbounded support that can be learned with sample complexity independent of both $n$ and the maximal element of the support? (ii) Are there general families of SIIRVs with unbounded support that admit proper sparse covers in total variation distance? As for question (i), we provide a set of simple conditions that allow the unbounded SIIRV to be learned with complexity $\text{poly}(1/ε)$ bypassing the aforementioned lower bound. We further address question (ii) in the general setting where each variable $X_i$ has unimodal probability mass function and is a different member of some, possibly multi-parameter, exponential family $\mathcal{E}$ that satisfies some structural properties. These properties allow $\mathcal{E}$ to contain heavy tailed and non log-concave distributions. Moreover, we show that for every $ε> 0$, and every $k$-parameter family $\mathcal{E}$ that satisfies some structural assumptions, there exists an algorithm with $\tilde{O}(k) \cdot \text{poly}(1/ε)$ samples that learns a sum of $n$ arbitrary members of $\mathcal{E}$ within $ε$ in TV distance. The output of the learning algorithm is also a sum of random variables whose distribution lies in the family $\mathcal{E}$. En route, we prove that any discrete unimodal exponential family with bounded constant-degree central moments can be approximated by the family corresponding to a bounded subset of the initial (unbounded) parameter space.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes