LG AI CY ML · Mar 23, 2022

Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning

Natalie Dullerud, Karsten Roth, Kimia Hamidieh, Nicolas Papernot, Marzyeh Ghassemi

arXiv:2203.12748v1 · 14.117 citations · h-index: 55

Originality: Incremental advance

AI Analysis

The paper addresses fairness issues in DML for applications such as retrieval and classification, but it is an incremental advance because it builds on existing DML methods.

The paper tackles fairness in deep metric learning (DML) by evaluating state-of-the-art methods trained on imbalanced data. It shows that the resulting representations hurt minority subgroup performance in downstream tasks, and reports that PARADE reduces subgroup gaps by up to 15% in accuracy.

Deep metric learning (DML) enables learning with less supervision through its emphasis on the similarity structure of representations. There has been much work on improving generalization of DML in settings like zero-shot retrieval, but little is known about its implications for fairness. In this paper, we are the first to evaluate state-of-the-art DML methods trained on imbalanced data, and to show the negative impact these representations have on minority subgroup performance when used for downstream tasks. In this work, we first define fairness in DML through an analysis of three properties of the representation space -- inter-class alignment, intra-class alignment, and uniformity -- and propose finDML, the fairness in non-balanced DML benchmark to characterize representation fairness. Utilizing finDML, we find bias in DML representations to propagate to common downstream classification tasks. Surprisingly, this bias is propagated even when training data in the downstream task is re-balanced. To address this problem, we present Partial Attribute De-correlation (PARADE) to de-correlate feature representations from sensitive attributes and reduce performance gaps between subgroups in both embedding space and downstream metrics.
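
The representation-space properties named in the abstract can be made concrete with a small sketch. The abstract does not give the finDML definitions, so the code below assumes common formulations from the alignment and uniformity literature: intra-class alignment as the mean squared distance between same-class pairs, and uniformity as the log of the mean Gaussian potential over all pairs. The function names, the temperature `t=2.0`, and the toy embeddings are hypothetical, and inter-class alignment is omitted. It illustrates how a subgroup gap might be measured and is not the authors' benchmark.

```python
import math
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_class_alignment(emb, labels):
    """Mean squared distance over same-class pairs (lower = tighter classes)."""
    dists = [
        sq_dist(emb[i], emb[j])
        for i, j in combinations(range(len(emb)), 2)
        if labels[i] == labels[j]
    ]
    return sum(dists) / len(dists) if dists else float("nan")


def uniformity(emb, t=2.0):
    """Log of the mean Gaussian potential over all pairs (lower = more spread out)."""
    pots = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(emb, 2)]
    return math.log(sum(pots) / len(pots)) if pots else float("nan")


def per_subgroup(emb, labels, groups):
    """Compute both metrics separately for each sensitive-attribute subgroup."""
    stats = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        sub_emb = [emb[i] for i in idx]
        sub_lab = [labels[i] for i in idx]
        stats[g] = {
            "alignment": intra_class_alignment(sub_emb, sub_lab),
            "uniformity": uniformity(sub_emb),
        }
    return stats


# Toy unit-norm embeddings (assumed data): the majority subgroup has tight
# classes, while the minority subgroup's classes are more scattered.
emb = [
    (1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0),  # majority
    (1.0, 0.0), (0.6, 0.8), (0.0, 1.0), (0.8, 0.6),  # minority
]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
groups = ["majority"] * 4 + ["minority"] * 4

stats = per_subgroup(emb, labels, groups)
alignment_gap = stats["minority"]["alignment"] - stats["majority"]["alignment"]
print(stats)
print("intra-class alignment gap:", alignment_gap)
```

A positive `alignment_gap` here means same-class points in the minority subgroup sit farther apart than in the majority subgroup, which is one way a representation can serve one subgroup worse than another.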
