CL, IR | Feb 23

Controllable Evidence Selection in Retrieval-Augmented Question Answering via Deterministic Utility Gating

arXiv:2603.18011

AI Analysis

The paper addresses the selection of redundant or incomplete evidence in AI question-answering systems and offers a more auditable, controlled approach, though it appears incremental because it builds on existing retrieval methods.

The paper tackles the selection of usable evidence in retrieval-augmented question answering. It introduces a deterministic framework built on Meaning-Utility Estimation (MUE) and Diversity-Utility Estimation (DUE), which judge each sentence's admissibility from explicit signals such as semantic relatedness and redundancy, and it requires no training.

Many modern AI question-answering systems convert text into vectors and retrieve the closest matches to a user question. While effective for topical similarity, similarity scores alone do not explain why some retrieved text can serve as evidence while other equally similar text cannot. When many candidates receive similar scores, systems may select sentences that are redundant, incomplete, or address different conditions than the question requires. This paper presents a deterministic evidence selection framework for retrieval-augmented question answering. The approach introduces Meaning-Utility Estimation (MUE) and Diversity-Utility Estimation (DUE), fixed scoring and redundancy-control procedures that determine evidence admissibility prior to answer generation. Each sentence or record is evaluated independently using explicit signals for semantic relatedness, term coverage, conceptual distinctiveness, and redundancy. No training or fine-tuning is required. In the prototype, a unit is accepted only if it explicitly states the fact, rule, or condition required by the task. Units are not merged or expanded. If no unit independently satisfies the requirement, the system returns no answer. This deterministic gating produces compact, auditable evidence sets and establishes a clear boundary between relevant text and usable evidence.
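The gating behavior described in the abstract can be illustrated with a minimal sketch. This is not the paper's MUE or DUE scoring, whose formulas are not given here. It assumes a simple lexical stand-in: question-term coverage plays the role of the relevance signal, and Jaccard token overlap plays the role of the redundancy check. The function names (`coverage`, `jaccard`, `select_evidence`) and the thresholds (`min_coverage`, `max_overlap`) are hypothetical and chosen only for illustration.

```python
import re
from typing import List, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text unit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def coverage(question: str, unit: str) -> float:
    """Fraction of question terms that appear in the unit (assumed stand-in for a relevance signal)."""
    q = tokens(question)
    if not q:
        return 0.0
    return len(q & tokens(unit)) / len(q)


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two units (assumed stand-in for a redundancy signal)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_evidence(
    question: str,
    units: List[str],
    min_coverage: float = 0.5,  # hypothetical threshold
    max_overlap: float = 0.6,   # hypothetical threshold
) -> Optional[List[str]]:
    """Return admitted units, or None when no unit passes the gate."""
    # Sort by score, breaking ties by input position so the result is deterministic.
    scored = sorted(
        ((coverage(question, u), i, u) for i, u in enumerate(units)),
        key=lambda t: (-t[0], t[1]),
    )
    accepted: List[str] = []
    for score, _, unit in scored:
        if score < min_coverage:
            continue  # related text, but not enough to count as usable evidence
        if any(jaccard(unit, kept) > max_overlap for kept in accepted):
            continue  # near-duplicate of a unit that was already admitted
        accepted.append(unit)  # units are kept verbatim, never merged or expanded
    return accepted or None  # abstain when nothing is admissible


question = "refund window for annual plans"
units = [
    "The refund window for annual plans is 30 days.",
    "The refund window for annual plans is 30 days long.",
    "Monthly plans renew automatically.",
]
evidence = select_evidence(question, units)
```

Because the sketch uses fixed thresholds and a stable tie-break, the same inputs always yield the same evidence set, and an empty result is reported as `None` rather than being filled with the closest match, mirroring the "return no answer" behavior the abstract describes.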
