LG · AI · Jan 11, 2021

Learning to Ignore: Fair and Task Independent Representations

arXiv:2101.04047v2
AI Analysis

This work addresses the problem of learning fair and interpretable machine learning models for practitioners concerned with bias and domain shift, offering a unified framework for these related challenges.

This paper proposes a method to learn invariant representations by enforcing a common feature representation across subgroups, formulated as an additional loss. This approach is applied to learn fair models and interpret sensitive attribute influence, as well as for domain adaptation and knowledge transfer.

Training fair machine learning models, making them interpretable, and solving the problem of domain shift have attracted a lot of interest in recent years. There is a vast amount of work addressing these topics, mostly in isolation. In this work we show that they can be seen as instances of a common framework: learning invariant representations. The representations should make it possible to predict the target while being invariant to the sensitive attributes that split the dataset into subgroups. Our approach is based on the simple observation that no learning algorithm can differentiate between samples that have the same feature representation. This is formulated as an additional loss (regularizer) that enforces a common feature representation across subgroups. We apply it to learn fair models and to interpret the influence of the sensitive attribute. Furthermore, it can be used for domain adaptation, for transferring knowledge, and for learning effectively from very few examples. In all of these applications it is essential not only to learn to predict the target, but also to learn what to ignore.
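To make the idea of an additional loss that enforces a common feature representation across subgroups more concrete, here is a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes one simple stand-in, a penalty on the squared distance between each subgroup's mean feature vector and the overall mean. The names `subgroup_alignment_penalty`, `total_loss`, and `weight` are hypothetical.

```python
from collections import defaultdict


def column_means(rows):
    """Mean of each feature dimension over a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def subgroup_alignment_penalty(features, groups):
    """Illustrative regularizer (an assumption, not the authors' exact loss).

    Returns the average squared distance between each subgroup's mean
    feature vector and the mean over all samples. It is zero when every
    subgroup has the same mean representation.
    """
    by_group = defaultdict(list)
    for row, group in zip(features, groups):
        by_group[group].append(row)

    overall = column_means(features)
    total = 0.0
    for rows in by_group.values():
        mean = column_means(rows)
        total += sum((m - o) ** 2 for m, o in zip(mean, overall))
    return total / len(by_group)


def total_loss(task_loss, features, groups, weight=1.0):
    """Task loss plus the weighted invariance penalty (hypothetical weighting)."""
    return task_loss + weight * subgroup_alignment_penalty(features, groups)
```

In this sketch, `features` would be the learned representations of a batch and `groups` the sensitive attribute (or domain label) of each sample. Matching only the means is a weak form of invariance; a stricter variant would compare the full feature distributions of the subgroups.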
