CL May 7, 2022

Learning Disentangled Textual Representations via Statistical Measures of Similarity

Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida

arXiv:2205.03589v232.1647 citations h-index: 28

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reducing the influence of sensitive attributes on text classification, offering a more efficient alternative to existing methods, though the advance is incremental.

The authors tackle the problem of learning disentangled textual representations for fair classification by introducing a family of regularizers based on statistical similarity measures. They report that these regularizers achieve better results than existing approaches without requiring additional training or tuning.

When working with textual data, a natural application of disentangled representations is fair classification, where the goal is to make predictions without being biased (or influenced) by sensitive attributes that may be present in the data (e.g., age, gender, or race). Dominant approaches to disentangling a sensitive attribute from textual representations rely on simultaneously learning a penalization term that involves either an adversarial loss (e.g., a discriminator) or an information measure (e.g., mutual information). However, these methods require training a deep neural network with several parameter updates for each update of the representation model. The resulting nested optimization loop is time-consuming, adds complexity to the optimization dynamics, and requires careful hyperparameter selection (e.g., learning rates, architecture). In this work, we introduce a family of regularizers for learning disentangled representations that do not require training. These regularizers are based on statistical measures of similarity between the conditional probability distributions with respect to the sensitive attributes. Our novel regularizers are faster and involve no additional tuning, while achieving better results when combined with both pretrained and randomly initialized text encoders.
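To make the idea of a training-free regularizer concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a binary sensitive attribute and uses the squared maximum mean discrepancy (MMD) with a Gaussian kernel as one possible statistical measure of similarity; this summary does not say which measures the paper actually uses. The function names (`rbf_kernel`, `mean_kernel`, `mmd_squared`, `disentanglement_penalty`) and the `bandwidth` parameter are hypothetical and chosen for illustration only.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two vectors given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def disentanglement_penalty(representations, sensitive_labels, bandwidth=1.0):
    """Squared MMD between representations grouped by a binary sensitive attribute.

    Assumption: labels are 0 or 1. The penalty has no trainable parameters,
    so no inner optimization loop is needed to evaluate it.
    """
    group0 = [z for z, s in zip(representations, sensitive_labels) if s == 0]
    group1 = [z for z, s in zip(representations, sensitive_labels) if s == 1]
    if not group0 or not group1:
        return 0.0  # a batch containing a single group carries no signal
    return mmd_squared(group0, group1, bandwidth)
```

In a training loop, such a penalty would typically be added to the task loss with a weighting coefficient, for example `loss = task_loss + lam * penalty` (with `lam` a hypothetical hyperparameter), and computed in a differentiable framework so gradients reach the encoder. The pure-Python version here only illustrates the quantity being penalized: it is close to zero when the two conditional distributions of representations look alike and grows as they separate.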
