CL · Feb 26

Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA

arXiv:2602.22584v1 · 1 citation · h-index: 9
Originality: Highly original
AI Analysis

This work significantly improves the faithfulness and safety of RAG systems for industrial advertising QA, directly mitigating financial, compliance, and legal risks for businesses.

This paper addresses the problem of hallucinated content, especially fabricated URLs, in industrial advertising question answering (QA) systems. The authors' framework, which pairs graph-aware retrieval with evidence-constrained reinforcement learning, reduced the hallucination rate by 72% in offline tests. In a two-week online A/B test it reduced URL hallucination by 92.7%, increased the like rate by 28.6%, and decreased the dislike rate by 46.2%.

Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. Although Retrieval-Augmented Generation (RAG) is widely adopted, deploying it in production remains challenging because industrial knowledge is inherently relational, frequently updated, and insufficiently aligned with generation objectives. We propose a reinforced co-adaptation framework that jointly optimizes retrieval and generation through two components: (1) Graph-aware Retrieval (GraphRAG), which models entity-relation structure over a high-citation knowledge subgraph for multi-hop, domain-specific evidence selection; and (2) evidence-constrained reinforcement learning via Group Relative Policy Optimization (GRPO) with multi-dimensional rewards covering faithfulness, style compliance, safety, and URL validity. Experiments on an internal advertising QA dataset show consistent gains across expert-judged dimensions including accuracy, completeness, and safety, while reducing the hallucination rate by 72%. A two-week online A/B test demonstrates a 28.6% increase in like rate, a 46.2% decrease in dislike rate, and a 92.7% reduction in URL hallucination. The system has been running in production for over half a year and has served millions of QA interactions.
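
To make the reward design in the abstract more concrete, the following is a minimal Python sketch of how a URL-validity reward might be combined with other reward dimensions and turned into group-relative advantages in the style of GRPO. It is an illustration under stated assumptions, not the authors' implementation: the function names, the URL regular expression, the reward weights, and the rule that an answer citing no URLs scores 1.0 are all hypothetical, and the faithfulness, style, and safety scores are assumed to come from external judges that are not shown here.

```python
import re
from statistics import mean, pstdev

# Hypothetical URL pattern; a production system would likely use a stricter parser.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")

def extract_urls(text):
    """Return the set of URLs in text, with trailing punctuation stripped."""
    return {u.rstrip(".,;:!?") for u in URL_PATTERN.findall(text)}

def url_validity(answer, evidence):
    """Fraction of URLs cited in the answer that also appear in the retrieved evidence.

    Assumption: an answer that cites no URLs gets the full score of 1.0.
    """
    cited = extract_urls(answer)
    if not cited:
        return 1.0
    allowed = set()
    for passage in evidence:
        allowed |= extract_urls(passage)
    return len(cited & allowed) / len(cited)

# Hypothetical weights over the four reward dimensions named in the abstract.
WEIGHTS = {"faithfulness": 0.4, "style": 0.1, "safety": 0.2, "url_validity": 0.3}

def combined_reward(scores):
    """Weighted sum of per-dimension scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and standard deviation of its sampling group.

    This is the group-relative baseline used by GRPO: responses sampled for the
    same prompt are compared against each other instead of a learned critic.
    The population standard deviation is used here; implementations differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, a response that fabricates a URL absent from the evidence receives a lower combined reward than its siblings in the same group, so its advantage is negative and the policy update pushes probability mass away from it.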

