Not Only the Last-Layer Features for Spurious Correlations: All Layer Deep Feature Reweighting
This work addresses spurious correlations, a key obstacle to group-level fairness in machine learning models. It is an incremental advance over existing last-layer reweighting methods.
The paper tackles spurious correlations by retraining the classifier on features drawn from all layers of the network, rather than only the last layer, so that key attributes discarded towards the last layer remain available. The authors report significant improvements in worst-group accuracy on several standard benchmarks; the abstract does not give specific numbers.
Spurious correlations are a major source of errors for machine learning models, in particular when aiming for group-level fairness. It has recently been shown that a powerful approach to combat spurious correlations is to re-train the last layer on a balanced validation dataset, isolating robust features for the predictor. However, key attributes can sometimes be discarded by neural networks towards the last layer. In this work, we thus consider retraining a classifier on a set of features derived from all layers. We utilize a recently proposed feature selection strategy to select unbiased features from all the layers. We observe that this approach gives significant improvements in worst-group accuracy on several standard benchmarks.
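The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions. Intermediate activations are mean-pooled over spatial dimensions before concatenation. The validation set is balanced by subsampling each group to the size of the smallest one. Because the abstract does not name the feature selection strategy, an L1-penalized logistic regression is swapped in as a stand-in that drives uninformative weights to zero. All function names (`pool_layer_features`, `group_balanced_indices`, `fit_l1_logreg`, `worst_group_accuracy`) are hypothetical.

```python
import numpy as np


def pool_layer_features(activations):
    """Concatenate per-layer activations into one (n_samples, d) matrix.

    Assumption: arrays with more than 2 dims are laid out as
    (n_samples, channels, *spatial) and are mean-pooled over the spatial dims.
    """
    pooled = []
    for act in activations:
        act = np.asarray(act, dtype=float)
        if act.ndim > 2:
            act = act.reshape(act.shape[0], act.shape[1], -1).mean(axis=2)
        pooled.append(act)
    return np.concatenate(pooled, axis=1)


def group_balanced_indices(groups, rng):
    """Subsample indices so every group appears as often as the smallest one."""
    groups = np.asarray(groups)
    group_ids = np.unique(groups)
    n_min = min(int((groups == g).sum()) for g in group_ids)
    picked = [
        rng.choice(np.flatnonzero(groups == g), size=n_min, replace=False)
        for g in group_ids
    ]
    return np.sort(np.concatenate(picked))


def fit_l1_logreg(X, y, l1=0.01, lr=0.1, steps=500):
    """Binary logistic regression with an L1 penalty (proximal gradient descent).

    Stand-in for the paper's feature selection step, which the abstract
    does not specify.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w = w - lr * (X.T @ (p - y) / n)
        # Soft-thresholding: shrinks small weights exactly to zero.
        w = np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)
        b = b - lr * float((p - y).mean())
    return w, b


def worst_group_accuracy(y_true, y_pred, groups):
    """Minimum per-group accuracy over all groups present in `groups`."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return min(
        float((y_pred[groups == g] == y_true[groups == g]).mean())
        for g in np.unique(groups)
    )
```

In use, one would collect activations from each layer on the held-out validation set, call `pool_layer_features`, restrict to `group_balanced_indices`, fit the classifier on that subset, and report `worst_group_accuracy` on the test set. Standardizing each feature column before fitting is a common additional step that this sketch omits.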