LG | AI | Nov 27, 2024

Metric-DST: Mitigating Selection Bias Through Diversity-Guided Semi-Supervised Metric Learning

arXiv:2411.18442v2 | 1 citation | h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses fairness issues in ML by mitigating selection bias and offers a flexible solution, though the advance appears incremental: it builds on self-training and adds a novel focus on diversity.

The paper tackles selection bias in machine learning by proposing Metric-DST, a diversity-guided semi-supervised method that uses metric learning to include diverse samples. The resulting models are more robust on generated and real-world datasets with induced bias and on a molecular biology task with intrinsic bias.

Selection bias poses a critical challenge for fairness in machine learning, as models trained on data that is less representative of the population might exhibit undesirable behavior for underrepresented profiles. Semi-supervised learning strategies like self-training can mitigate selection bias by incorporating unlabeled data into model training to gain further insight into the distribution of the population. However, conventional self-training seeks to include high-confidence data samples, which may reinforce existing model bias and compromise effectiveness. We propose Metric-DST, a diversity-guided self-training strategy that leverages metric learning and its implicit embedding space to counter confidence-based bias through the inclusion of more diverse samples. Metric-DST learned more robust models in the presence of selection bias for generated and real-world datasets with induced bias, as well as a molecular biology prediction task with intrinsic bias. The Metric-DST learning strategy offers a flexible and widely applicable solution to mitigate selection bias and enhance fairness of machine learning models.
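
The contrast the abstract draws between confidence-based and diversity-guided sample selection can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_by_confidence` and `select_by_diversity` are hypothetical, the embeddings are toy values rather than the output of a learned metric model, and greedy farthest-point selection is used here only as an assumed stand-in for "including more diverse samples".

```python
import numpy as np


def select_by_confidence(probs, k):
    """Conventional self-training step: keep the k most confident samples."""
    confidence = np.max(probs, axis=1)
    return np.argsort(-confidence, kind="stable")[:k]


def select_by_diversity(embeddings, k, start=0):
    """Greedy farthest-point selection in an embedding space.

    Assumption: this is an illustrative diversity heuristic, not the
    selection rule used by Metric-DST.
    """
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    min_dist = np.linalg.norm(embeddings - embeddings[start], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        dist_to_new = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(chosen)


if __name__ == "__main__":
    # Toy unlabeled pool: three tightly clustered points with confident
    # predictions, plus two outlying points with uncertain predictions.
    embeddings = np.array(
        [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [-5.0, 4.0]]
    )
    probs = np.array(
        [[0.99, 0.01], [0.98, 0.02], [0.97, 0.03], [0.60, 0.40], [0.35, 0.65]]
    )
    print("confidence-based:", select_by_confidence(probs, 3))
    print("diversity-guided:", select_by_diversity(embeddings, 3))
```

On this toy pool, confidence-based selection picks only the three clustered points, while the diversity heuristic also reaches the two outlying points, which is the kind of coverage the abstract argues confidence-based self-training tends to miss.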

Code Implementations: 1 repo
