FairNVT: Improving Fairness via Noise Injection in Vision Transformers
For practitioners using pretrained vision transformers, FairNVT offers a practical way to reduce bias while preserving task accuracy, though it is an incremental improvement over existing debiasing techniques.
FairNVT introduces a lightweight debiasing framework for pretrained transformer encoders that reduces sensitive-attribute leakage and improves fairness metrics (e.g., demographic parity, equalized odds) while preserving task accuracy across vision and language datasets.
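For readers unfamiliar with the two prediction-level metrics named here, the following is a minimal sketch of how gaps in demographic parity and equalized odds are commonly computed for binary predictions and a binary sensitive attribute. The function names and the choice to report the larger of the true-positive-rate and false-positive-rate gaps are assumptions for illustration; the paper's exact metric definitions may differ.

```python
def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    rates = []
    for g in (0, 1):
        preds = [p for p, a in zip(y_pred, group) if a == g]
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


def equalized_odds_gap(y_true, y_pred, group):
    """Larger of the between-group gaps in FPR (label 0) and TPR (label 1).

    Assumption: some works average the two gaps instead of taking the max.
    """
    gaps = []
    for label in (0, 1):
        rates = []
        for g in (0, 1):
            preds = [p for t, p, a in zip(y_true, y_pred, group)
                     if t == label and a == g]
            rates.append(sum(preds) / len(preds))
        gaps.append(abs(rates[0] - rates[1]))
    return max(gaps)


# Toy data (hypothetical): four examples per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(y_pred, group)      # 0.75 vs 0.25 -> 0.5
eo_gap = equalized_odds_gap(y_true, y_pred, group)  # both gaps are 0.5
```

A gap of 0 means the two groups receive positive predictions (or errors) at the same rate; smaller values indicate fairer predictions under the respective criterion.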
This paper presents FairNVT, a lightweight debiasing framework for pretrained transformer-based encoders that improves both representation-level and prediction-level fairness while preserving task accuracy. Unlike many existing debiasing approaches that address these notions separately, we argue they are inherently connected: suppressing sensitive information at the representation level can facilitate fairer predictions. Our approach learns task-relevant and sensitive embeddings via lightweight adapters, applies calibrated Gaussian noise to the sensitive embedding, and fuses it with the task representation. Together with orthogonality constraints and fairness regularization, these components jointly reduce sensitive-attribute leakage in the learned embeddings and encourage fairer downstream predictions. The framework is compatible with a wide range of pretrained transformer encoders. Across three datasets spanning vision and language, FairNVT reduces sensitive-attribute attacker accuracy, improves demographic-parity and equalized-odds metrics, and maintains high task performance.
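The pipeline described in the abstract (two adapters, noise on the sensitive embedding, fusion, an orthogonality term) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the adapters are plain linear maps with random weights standing in for trained ones, fusion is done by concatenation, the orthogonality term is a mean squared cosine similarity, and the noise scale `sigma` is an arbitrary placeholder rather than a calibrated value. All function names are hypothetical.

```python
import numpy as np


def inject_noise(z_sens, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return z_sens + rng.normal(0.0, sigma, size=z_sens.shape)


def fuse(z_task, z_sens_noisy):
    """Fuse by concatenation (assumption: the abstract does not name the operator)."""
    return np.concatenate([z_task, z_sens_noisy], axis=-1)


def orthogonality_penalty(z_task, z_sens):
    """Mean squared cosine similarity between paired embeddings (assumed form)."""
    t = z_task / (np.linalg.norm(z_task, axis=-1, keepdims=True) + 1e-12)
    s = z_sens / (np.linalg.norm(z_sens, axis=-1, keepdims=True) + 1e-12)
    return float(np.mean(np.sum(t * s, axis=-1) ** 2))


rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))       # stand-in for frozen encoder outputs
W_task = rng.normal(size=(16, 4))  # hypothetical task adapter weights
W_sens = rng.normal(size=(16, 4))  # hypothetical sensitive adapter weights

z_task = h @ W_task
z_sens = h @ W_sens
z_noisy = inject_noise(z_sens, sigma=1.0, rng=rng)  # sigma is a placeholder
fused = fuse(z_task, z_noisy)                       # shape (8, 8)
penalty = orthogonality_penalty(z_task, z_sens)     # value in [0, 1]
```

In a training loop, a penalty of this kind would be added to the task loss alongside a fairness regularizer, so that the task embedding is pushed away from directions that encode the sensitive attribute while the noise limits what an attacker can recover from the fused representation.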