LG · CV · May 12

Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models

arXiv:2603.05582 · 33.8 · h-index: 13
AI Analysis

For practitioners who need efficient debiasing, BISE offers a lightweight alternative to data-centric or retraining-based methods, though the contribution is incremental: it applies existing pruning techniques to a new problem.

BISE extracts bias-free subnetworks from vanilla-trained models via pruning, achieving competitive debiasing without retraining or additional data. On BiasedMNIST, it reduces bias reliance while maintaining accuracy within 1% of the original model.

The issue of algorithmic biases in deep learning has led to the development of various debiasing techniques, many of which perform complex training procedures or dataset manipulation. However, an intriguing question arises: is it possible to extract fair and bias-agnostic subnetworks from standard vanilla-trained models without relying on additional data, such as an unbiased training set? In this work, we introduce Bias-Invariant Subnetwork Extraction (BISE), a learning strategy that identifies and isolates "bias-free" subnetworks that already exist within conventionally trained models, without retraining or finetuning the original parameters. Our approach demonstrates that such subnetworks can be extracted via pruning and can operate without modification, relying less on biased features while maintaining robust performance. Our findings contribute towards efficient bias mitigation through structural adaptation of pre-trained neural networks via parameter removal, as opposed to costly strategies that are either data-centric or involve (re)training all model parameters. Extensive experiments on common benchmarks show the advantages of our approach in terms of the performance and computational efficiency of the resulting debiased model.
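To make the idea of a subnetwork with untouched weights concrete, here is a minimal toy sketch in plain Python. It is not the BISE procedure: the abstract does not say how the mask is chosen, so the mask below is picked by hand purely for illustration. All names (`FROZEN_WEIGHTS`, `FULL_MASK`, `PRUNED_MASK`, `predict`, `accuracy`) and the two-feature dataset are hypothetical.

```python
# Toy illustration only: a frozen linear classifier with two input features.
# Feature 0 is the "core" signal, feature 1 is a spurious "bias" attribute.
# Assumption: the vanilla model leans harder on the bias feature (weight 2.0).
FROZEN_WEIGHTS = (1.0, 2.0)   # a tuple, so the parameters cannot be modified

FULL_MASK = (1, 1)            # the original (vanilla) model
PRUNED_MASK = (1, 0)          # hand-picked subnetwork: bias connection removed


def predict(weights, mask, x):
    """Classify x using only the weights that the binary mask keeps."""
    score = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
    return 1 if score > 0 else 0


def accuracy(weights, mask, data):
    """Fraction of (x, y) pairs classified correctly under the given mask."""
    return sum(predict(weights, mask, x) == y for x, y in data) / len(data)


# The label depends only on the core feature (feature 0).
# Bias-aligned samples: the bias feature agrees with the core feature.
ALIGNED = [((1, 1), 1), ((-1, -1), 0)]
# Bias-conflicting samples: the bias feature disagrees with the core feature.
CONFLICTING = [((1, -1), 1), ((-1, 1), 0)]

for name, mask in (("vanilla", FULL_MASK), ("pruned", PRUNED_MASK)):
    print(
        name,
        "aligned acc:", accuracy(FROZEN_WEIGHTS, mask, ALIGNED),
        "conflicting acc:", accuracy(FROZEN_WEIGHTS, mask, CONFLICTING),
    )
```

In this toy case the vanilla model classifies every bias-aligned sample correctly and every bias-conflicting sample incorrectly, while the pruned subnetwork, which reuses the same frozen weights, classifies both groups correctly. A real method would have to find the mask automatically and, per the abstract, without access to an unbiased dataset.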

