CL | Oct 31, 2022

Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives

Tsinghua
arXiv:2210.17167v1293 citations | h-index: 26 | Has Code
Originality: Highly original
AI Analysis

This paper addresses training instability in dense retrieval models, offering a competitive solution with reduced forgetting and faster convergence.

The paper tackles catastrophic forgetting in dense retrieval training by proposing ANCE-Tele, which uses teleportation negatives to smooth learning. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size and eliminates the dependency on sparse retrieval negatives.

In this paper, we investigate the instability in standard dense retrieval training, which iterates between model training and hard negative selection using the model being trained. We show the catastrophic forgetting phenomenon behind the training instability, where models learn and forget different negative groups during training iterations. We then propose ANCE-Tele, which accumulates momentum negatives from past iterations and approximates future iterations using lookahead negatives, as "teleportations" along the time axis to smooth the learning process. On web search and OpenQA, ANCE-Tele outperforms previous state-of-the-art systems of similar size, eliminates the dependency on sparse retrieval negatives, and is competitive among systems using significantly more (50x) parameters. Our analysis demonstrates that teleportation negatives reduce catastrophic forgetting and improve convergence speed for dense retrieval training. Our code is available at https://github.com/OpenMatch/ANCE-Tele.
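The abstract describes building each iteration's negatives from three sources: hard negatives from the model being trained, momentum negatives carried over from past iterations, and lookahead negatives that approximate future iterations. The following is a minimal sketch of that combination step only, not the authors' implementation. It assumes the three sources are simply unioned per query with positives and duplicates removed, which the abstract does not specify. The function name `merge_teleportation_negatives` and all document IDs are hypothetical, and how each source list is produced is outside the sketch.

```python
from typing import Dict, List, Sequence, Set


def merge_teleportation_negatives(
    current: Dict[str, Sequence[str]],
    momentum: Dict[str, Sequence[str]],
    lookahead: Dict[str, Sequence[str]],
    positives: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Union three negative sources per query (hypothetical helper).

    Assumption: a plain order-preserving union; the real system may
    weight or sample the sources differently.
    """
    merged: Dict[str, List[str]] = {}
    for qid in current:
        # Start with the positives marked as seen so they are never
        # emitted as negatives.
        seen: Set[str] = set(positives.get(qid, set()))
        pool: List[str] = []
        for source in (current, momentum, lookahead):
            for doc_id in source.get(qid, ()):
                if doc_id not in seen:
                    seen.add(doc_id)
                    pool.append(doc_id)
        merged[qid] = pool
    return merged


# Toy data: one query, with "d1" as its relevant document.
current = {"q1": ["d3", "d7", "d1"]}   # retrieved by the model being trained
momentum = {"q1": ["d7", "d9"]}        # negatives kept from a past iteration
lookahead = {"q1": ["d4", "d1"]}       # approximation of a future iteration
positives = {"q1": {"d1"}}

pool = merge_teleportation_negatives(current, momentum, lookahead, positives)
print(pool["q1"])  # ['d3', 'd7', 'd9', 'd4']
```

Keeping `d9` in the pool even though the current model no longer retrieves it illustrates the intent stated in the abstract: negatives learned in earlier iterations stay in training, so the model is less likely to forget them.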

Code Implementations: 1 repo
