LG, IR · Dec 15, 2025

Learning to Retrieve with Weakened Labels: Robust Training under Label Noise

arXiv:2512.13237v1
Originality: Incremental advance
AI Analysis

This paper addresses sparse annotation and label noise in training data for NLP retrieval tasks. It offers a robust method that is an incremental advance over existing loss-based and data-cleaning approaches.

The paper tackles the problem of training neural retrieval models under label noise by proposing a label weakening approach that uses sets of plausible labels derived from supervision and model confidence, and shows it improves performance compared to 10 state-of-the-art loss functions on four diverse ranking datasets.

Neural encoders are frequently used in the NLP domain to perform dense retrieval tasks, for instance, to generate the candidate documents for a given query in question-answering tasks. However, sparse annotation and label noise in the training data make it challenging to train or fine-tune such retrieval models. Although existing works have attempted to mitigate these problems by incorporating modified loss functions or data cleaning, these approaches either require hyperparameter tuning during training or add substantial complexity to the training setup. In this work, we consider a label weakening approach to generate robust retrieval models in the presence of label noise. Instead of enforcing a single, potentially erroneous label for each query-document pair, we allow for a set of plausible labels derived from both the observed supervision and the model's confidence scores. We perform an extensive evaluation with two retrieval models and one re-ranking model on four diverse ranking datasets. To this end, we also consider a realistic noisy setting by using a semantic-aware noise generation technique to generate different ratios of noise. Our initial results show that label weakening can improve the performance of the retrieval tasks in comparison to 10 different state-of-the-art loss functions.
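The idea of replacing a single label with a set of plausible labels can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how the label set or the loss is built, so the function names, the confidence `threshold`, and the choice of an optimistic "minimum over the set" loss are all assumptions made here for illustration. Labels are binary relevance values (0 = non-relevant, 1 = relevant) and `probs` is the model's predicted distribution for one query-document pair.

```python
import math

def plausible_labels(observed, probs, threshold=0.9):
    """Return the labels treated as plausible for one query-document pair.

    Hypothetical rule: always keep the observed label, and add any label
    the model predicts with probability at or above `threshold`.
    """
    labels = {observed}
    for label, p in enumerate(probs):
        if p >= threshold:
            labels.add(label)
    return labels

def weakened_loss(observed, probs, threshold=0.9, eps=1e-12):
    """Smallest cross-entropy over the plausible labels (assumed loss form)."""
    candidates = plausible_labels(observed, probs, threshold)
    return min(-math.log(max(probs[label], eps)) for label in candidates)

# A pair labelled non-relevant (0) that the model confidently scores as relevant (1):
# both labels are plausible, so the possibly wrong annotation is not enforced.
noisy_pair = weakened_loss(observed=0, probs=[0.05, 0.95])

# A pair where the model is unsure: only the observed label is plausible,
# so the loss reduces to ordinary cross-entropy on that label.
unsure_pair = weakened_loss(observed=0, probs=[0.4, 0.6])
```

In this sketch the first pair incurs only a small penalty because the model's confident prediction is admitted into the label set, while the second pair is trained as usual. The fixed threshold is a simplification for readability and is not taken from the paper.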

