DS DM LG PR STApr 26, 2020

Learning and Testing Junta Distributions with Subcube Conditioning

Xi Chen, Rajesh Jayaram, Amit Levi, Erik Waingarten

arXiv:2004.12496v110.332 citations

Originality Incremental advance

AI Analysis

This addresses an open question in distribution testing, offering significant efficiency improvements for high-dimensional data analysis, though it is incremental in advancing subcube conditioning methods.

The paper tackles the problem of learning and testing junta distributions on high-dimensional binary spaces, achieving algorithms with query complexities of Õ(k/ε²) log n + O(2^k/ε²) for learning and Õ((k + √n)/ε²) for testing, which are optimal up to poly-logarithmic factors.

We study the problems of learning and testing junta distributions on $\{-1,1\}^n$ with respect to the uniform distribution, where a distribution $p$ is a $k$-junta if its probability mass function $p(x)$ depends on a subset of at most $k$ variables. The main contribution is an algorithm for finding relevant coordinates in a $k$-junta distribution with subcube conditioning [BC18, CCKLW20]. We give two applications: 1. An algorithm for learning $k$-junta distributions with $\tilde{O}(k/ε^2) \log n + O(2^k/ε^2)$ subcube conditioning queries, and 2. An algorithm for testing $k$-junta distributions with $\tilde{O}((k + \sqrt{n})/ε^2)$ subcube conditioning queries. All our algorithms are optimal up to poly-logarithmic factors. Our results show that subcube conditioning, as a natural model for accessing high-dimensional distributions, enables significant savings in learning and testing junta distributions compared to the standard sampling model. This addresses an open question posed by Aliakbarpour, Blais, and Rubinfeld [ABR17].

View on arXiv PDF

Similar