Class-Conditional Distribution Balancing for Group Robust Classification
This work addresses robust generalization for machine learning models in real-world settings where spurious correlations are common. It offers a practical solution that avoids costly bias annotations, though it is an incremental improvement over existing approaches.
The paper tackles spurious correlations in classification by reframing them as imbalances in class-conditional distributions. It proposes a sample reweighting method that balances these distributions without requiring bias annotations or bias predictions, and reports state-of-the-art performance comparable to methods that rely on bias supervision.
Spurious correlations that lead models to correct predictions for the wrong reasons pose a critical challenge for robust real-world generalization. Existing research attributes this issue to group imbalance and addresses it by maximizing group-balanced or worst-group accuracy, which heavily relies on expensive bias annotations. A compromise approach involves predicting bias information using extensively pretrained foundation models, which requires large-scale data and becomes impractical for resource-limited rare domains. To address these challenges, we offer a novel perspective by reframing the spurious correlations as imbalances or mismatches in class-conditional distributions, and propose a simple yet effective robust learning method that eliminates the need for both bias annotations and predictions. With the goal of reducing the mutual information between spurious factors and label information, our method leverages a sample reweighting strategy to achieve class-conditional distribution balancing, which automatically highlights minority groups and classes, effectively dismantling spurious correlations and producing a debiased data distribution for classification. Extensive experiments and analysis demonstrate that our approach consistently delivers state-of-the-art performance, rivaling methods that rely on bias supervision.
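To make the link between reweighting and mutual information concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes a discrete spurious attribute `a` that is fully observed, which the proposed method explicitly does not require; the function names, the weight formula w = p(a) / p(a | y), and the toy data are illustrative assumptions. Under that assumption, the reweighted joint distribution factorizes as p(y) p(a), so every class sees the same attribute distribution and the mutual information between attribute and label drops to zero.

```python
import numpy as np


def balancing_weights(y, a):
    """Illustrative weights w_i = p(a_i) / p(a_i | y_i) for discrete y and a.

    Assumes the spurious attribute `a` is observed (a toy simplification).
    """
    y = np.asarray(y)
    a = np.asarray(a)
    weights = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        p_a = np.mean(a == a[i])                     # marginal p(a_i)
        p_a_given_y = np.mean(a[y == y[i]] == a[i])  # class-conditional p(a_i | y_i)
        weights[i] = p_a / p_a_given_y
    return weights


def weighted_mutual_information(y, a, w):
    """Mutual information I(Y; A) in nats under sample weights w."""
    y = np.asarray(y)
    a = np.asarray(a)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()  # normalize weights into a probability distribution
    mi = 0.0
    for y_val in np.unique(y):
        for a_val in np.unique(a):
            p_joint = w[(y == y_val) & (a == a_val)].sum()
            if p_joint > 0:
                p_y = w[y == y_val].sum()
                p_a = w[a == a_val].sum()
                mi += p_joint * np.log(p_joint / (p_y * p_a))
    return mi


# Toy data: in each class, 90% of samples carry the "aligned" attribute value.
y_toy = np.array([0] * 50 + [1] * 50)
a_toy = np.array([0] * 45 + [1] * 5 + [0] * 5 + [1] * 45)

w_toy = balancing_weights(y_toy, a_toy)
mi_before = weighted_mutual_information(y_toy, a_toy, np.ones(len(y_toy)))
mi_after = weighted_mutual_information(y_toy, a_toy, w_toy)
print(f"I(Y; A) before reweighting: {mi_before:.4f}")
print(f"I(Y; A) after reweighting:  {mi_after:.4f}")
```

In this toy setting the minority-group samples (attribute value disagreeing with the class majority) receive larger weights than the majority-group samples, which mirrors the abstract's claim that balancing class-conditional distributions automatically highlights minority groups. How the paper obtains such weights without access to `a` is not specified in the abstract and is not modeled here.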