LG DS SP MLDec 29, 2020

Source Identification for Mixtures of Product Distributions

Spencer L. Gordon, Bijan Mazaheri, Yuval Rabani, Leonard J. Schulman

arXiv:2012.14540v18.524 citations

Originality Incremental advance

AI Analysis

This work addresses the fundamental problem of source identification in machine learning, offering a computationally bounded solution for researchers and practitioners working with mixtures of product distributions.

This paper presents an algorithm for identifying the source parameters of a mixture of k product distributions on n bits. The algorithm uses approximate multilinear moments as input and operates in $2^{O(k^2)} n^{O(k)}$ arithmetic operations, providing the first explicit bound on the computational complexity for this problem.

We give an algorithm for source identification of a mixture of $k$ product distributions on $n$ bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)} n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O'Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).

View on arXiv PDF

Similar