Learning Mixtures of Linear Classifiers
This addresses a discriminative learning problem for machine learning practitioners, offering a provably efficient alternative to existing methods like EM.
The paper tackles the problem of learning mixtures of linear classifiers without provable guarantees, developing a spectral method with a mirroring trick that achieves nearly optimal statistical efficiency under a probabilistic assumption on feature distribution.
We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers. Existing approaches are based on the EM algorithm, or similar techniques, without provable guarantees. We develop a simple method based on spectral techniques and a `mirroring' trick, that discovers the subspace spanned by the classifiers' parameter vectors. Under a probabilistic assumption on the feature vector distribution, we prove that this approach has nearly optimal statistical efficiency.