AINov 6, 2025

DMA: Online RAG Alignment with Human Feedback

arXiv:2511.04880v1h-index: 6
Originality Incremental advance
AI Analysis

This addresses the issue of adapting RAG systems to evolving intent and content drift for users in interactive settings, though it appears incremental as it builds on existing feedback and ranking methods.

The paper tackles the problem of static retrieval in RAG systems by introducing DMA, an online learning framework that incorporates human feedback to align ranking, resulting in substantial improvements in human engagement in deployment and notable gains on conversational QA benchmarks like TriviaQA and HotpotQA.

Retrieval-augmented generation (RAG) systems often rely on static retrieval, limiting adaptation to evolving intent and content drift. We introduce Dynamic Memory Alignment (DMA), an online learning framework that systematically incorporates multi-granularity human feedback to align ranking in interactive settings. DMA organizes document-, list-, and response-level signals into a coherent learning pipeline: supervised training for pointwise and listwise rankers, policy optimization driven by response-level preferences, and knowledge distillation into a lightweight scorer for low-latency serving. Throughout this paper, memory refers to the model's working memory, which is the entire context visible to the LLM for In-Context Learning. We adopt a dual-track evaluation protocol mirroring deployment: (i) large-scale online A/B ablations to isolate the utility of each feedback source, and (ii) few-shot offline tests on knowledge-intensive benchmarks. Online, a multi-month industrial deployment further shows substantial improvements in human engagement. Offline, DMA preserves competitive foundational retrieval while yielding notable gains on conversational QA (TriviaQA, HotpotQA). Taken together, these results position DMA as a principled approach to feedback-driven, real-time adaptation in RAG without sacrificing baseline capability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes