ML LG OC ME
Jan 29, 2023

Imbalanced Mixed Linear Regression

arXiv:2301.12559v1
5 citations
h-index: 43
Originality: Incremental advance
AI Analysis

The paper addresses a practical challenge in mixed linear regression, namely data with imbalanced component proportions, and offers an incremental improvement over existing methods.

The paper tackles mixed linear regression with imbalanced component proportions, a setting where existing methods often fail. It proposes Mix-IRLS, a sequential algorithm that performs well in imbalanced settings and outperforms other methods on real-world datasets, sometimes by a large margin.

We consider the problem of mixed linear regression (MLR), where each observed sample belongs to one of $K$ unknown linear models. In practical applications, the proportions of the $K$ components are often imbalanced. Unfortunately, most MLR methods do not perform well in such settings. Motivated by this practical challenge, in this work we propose Mix-IRLS, a novel, simple and fast algorithm for MLR with excellent performance on both balanced and imbalanced mixtures. In contrast to popular approaches that recover the $K$ models simultaneously, Mix-IRLS does it sequentially using tools from robust regression. Empirically, Mix-IRLS succeeds in a broad range of settings where other methods fail. These include imbalanced mixtures, small sample sizes, presence of outliers, and an unknown number of models $K$. In addition, Mix-IRLS outperforms competing methods on several real-world datasets, in some cases by a large margin. We complement our empirical results by deriving a recovery guarantee for Mix-IRLS, which highlights its advantage on imbalanced mixtures.
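
The sequential idea described in the abstract can be illustrated with a small sketch. This is not the authors' Mix-IRLS implementation; it is a simplified stand-in that assumes a Cauchy-type weight function, a MAD-based residual scale, and a fixed residual threshold of 3 scale units, none of which come from the paper. The function names `robust_irls` and `sequential_mlr` and the synthetic data are hypothetical. The sketch fits one model robustly, treating samples from the other components as outliers, removes the samples that model explains, and repeats on the remainder.

```python
import numpy as np

def robust_irls(X, y, n_iter=50, eps=1e-12):
    """Robust linear fit by iteratively reweighted least squares (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]       # ordinary least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        scale = 1.4826 * np.median(np.abs(r)) + eps   # MAD-style robust scale (assumption)
        w = 1.0 / (1.0 + (r / scale) ** 2)            # Cauchy-type weights: large residuals count less
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    r = y - X @ beta
    scale = 1.4826 * np.median(np.abs(r)) + eps
    return beta, scale

def sequential_mlr(X, y, K, thresh=3.0):
    """Recover up to K linear models one at a time (hypothetical helper)."""
    estimates = []
    active = np.ones(len(y), dtype=bool)              # samples not yet explained
    for _ in range(K):
        if active.sum() < X.shape[1]:
            break                                     # too few samples left to fit a model
        beta, scale = robust_irls(X[active], y[active])
        estimates.append(beta)
        explained = np.abs(y - X @ beta) <= thresh * scale
        active &= ~explained                          # drop samples this model explains
    return estimates

# Synthetic imbalanced mixture: roughly 80% of samples from model 0, 20% from model 1.
rng = np.random.default_rng(0)
n, d = 600, 3
true_betas = np.array([[2.0, -1.0, 0.5], [-1.5, 1.0, 3.0]])
labels = (rng.random(n) < 0.2).astype(int)
X = rng.standard_normal((n, d))
y = np.einsum("ij,ij->i", X, true_betas[labels]) + 0.05 * rng.standard_normal(n)

estimates = sequential_mlr(X, y, K=2)
errors = [np.linalg.norm(b - t) for b, t in zip(estimates, true_betas)]
print(errors)
```

In this toy setting the dominant component is recovered first because its samples form the majority that the robust fit locks onto, which gives some intuition for why a sequential approach may suit imbalanced mixtures.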

