LG CL Mar 10, 2025

Fair Text Classification via Transferable Representations

arXiv:2503.07691v1, 1 citation, h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses group fairness in text classification across sensitive groups; it is an incremental improvement over existing methods.

The paper tackles group fairness in text classification by proposing an approach that uses Wasserstein Dependency Measure and adversarial training to induce independence between target label and sensitive attribute representations, and leverages Domain Adaptation to eliminate the need for sensitive attribute access. The method is supported by theoretical and empirical evidence.

Group fairness is a central research topic in text classification, where reaching fair treatment between sensitive groups (e.g., women and men) remains an open challenge. We propose an approach that extends the use of the Wasserstein Dependency Measure for learning unbiased neural text classifiers. Given the challenge of distinguishing fair from unfair information in a text encoder, we draw inspiration from adversarial training by inducing independence between representations learned for the target label and those for a sensitive attribute. We further show that Domain Adaptation can be efficiently leveraged to remove the need for access to the sensitive attributes in the dataset we cure. We provide both theoretical and empirical evidence that our approach is well-founded.
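To make the dependency idea concrete: the Wasserstein Dependency Measure between two variables is the Wasserstein-1 distance between their joint distribution and the product of their marginals, and by Kantorovich-Rubinstein duality any single 1-Lipschitz critic gives a lower bound on it. The sketch below is only an illustration under loud assumptions: it uses scalar toy "representations", a fixed hand-picked critic instead of a learned neural critic, and hypothetical names (`critic`, `dependency_lower_bound`) that do not come from the paper. Shuffling one side of the pairs is a standard way to draw approximate samples from the product of marginals.

```python
import math
import random


def critic(a, b):
    # Fixed critic, assumed for illustration only (the paper would learn one).
    # |a - b| has Lipschitz constant sqrt(2) under the Euclidean norm on (a, b),
    # so dividing by sqrt(2) makes it 1-Lipschitz.
    return abs(a - b) / math.sqrt(2)


def dependency_lower_bound(pairs, rng):
    """Estimate |E_joint[f] - E_product[f]| for the single fixed critic f."""
    a_vals = [a for a, _ in pairs]
    b_vals = [b for _, b in pairs]
    # Shuffling b breaks the pairing: (a, shuffled b) approximates
    # samples from the product of the two marginals.
    shuffled_b = b_vals[:]
    rng.shuffle(shuffled_b)
    joint = sum(critic(a, b) for a, b in zip(a_vals, b_vals)) / len(pairs)
    product = sum(critic(a, b) for a, b in zip(a_vals, shuffled_b)) / len(pairs)
    return abs(joint - product)


rng = random.Random(0)
n = 5000
# Toy data: in `dependent`, the second coordinate is almost a copy of the first.
dependent = [(a, a + 0.1 * rng.gauss(0, 1))
             for a in (rng.gauss(0, 1) for _ in range(n))]
independent = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

dep_score = dependency_lower_bound(dependent, rng)
ind_score = dependency_lower_bound(independent, rng)
print(f"dependent pairs:   {dep_score:.3f}")
print(f"independent pairs: {ind_score:.3f}")
```

A fairness-oriented training loop would push this kind of score toward zero for the pair (label representation, sensitive-attribute representation); how the paper parameterises the critic and combines the term with the classification loss is not specified in the abstract above.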

