LG · Oct 21, 2022

Just Mix Once: Worst-group Generalization by Group Interpolation

MIT
arXiv:2210.12195v1 · 3 citations · h-index: 10
Originality: Incremental advance
AI Analysis

The paper addresses poor generalization to minority groups in machine learning models. It offers a domain-agnostic, efficient solution that is an incremental advance over existing self-supervised and oversampling approaches.

Models that rely on superficial patterns are brittle and perform poorly under group distribution shifts. To tackle this, the paper proposes Just Mix Once (JM1), a class-conditional mixup method for worst-group generalization. JM1 performs on par with or better than state-of-the-art methods without requiring full group annotation.

Advances in deep learning theory have revealed how average generalization relies on superficial patterns in data. The consequence is brittle models that perform poorly when the group distribution shifts at test time. When group annotation is available, we can use robust optimization tools to tackle the problem. However, identifying and annotating groups is time-consuming, especially on large datasets. A recent line of work leverages self-supervision and oversampling to improve generalization on minority groups without group annotation. We propose to unify and generalize these approaches using a class-conditional variant of mixup tailored for worst-group generalization. Our approach, Just Mix Once (JM1), interpolates samples during learning, augmenting the training distribution with a continuous mixture of groups. JM1 is domain agnostic and computationally efficient, can be used with any level of group annotation, and performs on par with or better than the state of the art on worst-group generalization. Additionally, we provide a simple explanation of why JM1 works.
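To make the idea of class-conditional interpolation concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `class_conditional_mixup`, the `alpha` parameter, and the Beta-distributed mixing coefficient follow standard mixup conventions and are not taken from the abstract. The sketch pairs each sample with a random partner of the same class and ignores group information entirely, whereas the paper's method may select partners differently.

```python
import numpy as np


def class_conditional_mixup(x, y, alpha=1.0, rng=None):
    """Mix each sample with a random partner that shares its class label.

    Hypothetical helper for illustration; partner selection here is uniform
    within a class, which is an assumption rather than the paper's rule.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)

    # For every class, shuffle the indices of its members to pick partners.
    partner = np.arange(len(y))
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        partner[idx] = rng.permutation(idx)

    # One mixing coefficient per sample, drawn from Beta(alpha, alpha)
    # as in standard mixup (assumed, not stated in the abstract).
    lam = rng.beta(alpha, alpha, size=len(y))
    lam = lam.reshape(-1, *([1] * (x.ndim - 1)))  # broadcast over features

    # Convex combination of each sample with its same-class partner.
    x_mixed = lam * x + (1.0 - lam) * x[partner]

    # Both endpoints share a label, so the label is left unchanged.
    return x_mixed, y
```

Because both samples in a pair carry the same class label, the interpolated input keeps that label. If the two samples happen to come from different groups within the class, the mixed point lies between those groups, which is one way to read the abstract's "continuous mixture of groups".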
