A Convex Program for Mixed Linear Regression with a Recovery Guarantee for Well-Separated Data
This provides a theoretical guarantee for mixed linear regression, a domain-specific problem in statistics and machine learning, but it is incremental as it builds on existing convex optimization methods.
The authors tackled the problem of mixed linear regression by introducing a convex second-order cone program based on L1 minimization, which exactly recovers all mixture components in noiseless settings under well-separation assumptions, requiring at least d independent measurements per class.
We introduce a convex approach for mixed linear regression over $d$ features. This approach is a second-order cone program, based on L1 minimization, which assigns an estimate regression coefficient in $\mathbb{R}^{d}$ for each data point. These estimates can then be clustered using, for example, $k$-means. For problems with two or more mixture classes, we prove that the convex program exactly recovers all of the mixture components in the noiseless setting under technical conditions that include a well-separation assumption on the data. Under these assumptions, recovery is possible if each class has at least $d$ independent measurements. We also explore an iteratively reweighted least squares implementation of this method on real and synthetic data.