IR, LG | Dec 13, 2024

MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation

arXiv:2412.10313v1 | 13 citations | h-index: 4 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses domain-specific retrieval performance for regulatory NLP, but it is incremental as it builds on existing tuning and retrieval methods.

The paper tackles the challenge of adapting frozen retrieval-augmented generators to regulatory documents by proposing a multi-stage tuning strategy (MST-R) for retrieval systems, achieving top rank on the RegNLP challenge leaderboard and exposing metric gaming issues with a trivial answering approach.

Regulatory documents are rich in nuanced terminology and specialized semantics. Frozen retrieval-augmented generators (FRAG systems), which use pre-trained (that is, frozen) components, consequently face challenges in both retrieval and answering performance. We present a system that adapts the retriever to the target domain using a multi-stage tuning (MST) strategy. Our retrieval approach, called MST-R, (a) first fine-tunes the encoders used in vector stores using hard negative mining, (b) then uses a hybrid retriever that combines sparse and dense retrievers via reciprocal rank fusion, and (c) finally adapts the cross-attention encoder by fine-tuning on only the top-k retrieved results. We benchmark system performance on the dataset released for the RIRAG challenge (part of the RegNLP workshop at COLING 2025). We achieve significant performance gains, obtaining a top rank on the RegNLP challenge leaderboard. We also show that a trivial answering approach games the RePASs metric, outscoring all baselines and a pre-trained Llama model. Analyzing this anomaly, we present important takeaways for future research.
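
Step (b) of the abstract combines sparse and dense retrievers using reciprocal rank fusion. A minimal sketch of that fusion rule follows, for illustration only: the function name `reciprocal_rank_fusion`, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions, not details taken from the paper or its repository.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a sparse (e.g. BM25-style) and a dense retriever.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Because the rule uses only rank positions, the sparse and dense scores never need to be normalized to a common scale, which is one reason it is a convenient choice for hybrid retrieval.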

Code Implementations: 1 repo