Towards Agents That Know When They Don't Know: Uncertainty as a Control Signal for Structured Reasoning
This work addresses the unreliability of LLM agents that reason over complex multi-table biomedical data, offering a method that improves factuality and calibration to support better decision-making in healthcare applications.
The paper tackles overconfident outputs from LLM agents in structured biomedical data environments by introducing an uncertainty-aware agent that uses retrieval and summary uncertainty signals. The approach nearly triples correct and useful claims per summary (from 3.0 to 8.4 on the internal benchmark and from 3.6 to 9.9 on the cancer multi-omics benchmark) and improves downstream survival prediction (C-index from 0.32 to 0.63).
Large language model (LLM) agents are increasingly deployed in structured biomedical data environments, yet they often produce fluent but overconfident outputs when reasoning over complex multi-table data. We introduce an uncertainty-aware agent for query-conditioned multi-table summarization that leverages two complementary signals: (i) retrieval uncertainty--entropy over multiple table-selection rollouts--and (ii) summary uncertainty--combining self-consistency and perplexity. Summary uncertainty is incorporated into reinforcement learning (RL) with Group Relative Policy Optimization (GRPO), while both retrieval and summary uncertainty guide inference-time filtering and support the construction of higher-quality synthetic datasets. On multi-omics benchmarks, our approach improves factuality and calibration, nearly tripling correct and useful claims per summary (3.0\(\rightarrow\)8.4 internal; 3.6\(\rightarrow\)9.9 cancer multi-omics) and substantially improving downstream survival prediction (C-index 0.32\(\rightarrow\)0.63). These results demonstrate that uncertainty can serve as a control signal--enabling agents to abstain, communicate confidence, and become more reliable tools for complex structured-data environments.
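The two signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that retrieval uncertainty is the Shannon entropy of the empirical distribution over distinct table selections across rollouts, and that perplexity is computed from per-token log-probabilities of a generated summary. The function names, the thresholds in `should_abstain`, and the simple OR rule for combining the signals are all hypothetical. The self-consistency component of summary uncertainty is omitted because the abstract does not specify how it is measured.

```python
import math
from collections import Counter


def retrieval_entropy(rollouts):
    """Shannon entropy (in nats) over distinct table selections.

    Each rollout is an iterable of table names; order within a rollout
    is ignored, so ["a", "b"] and ["b", "a"] count as the same selection.
    """
    counts = Counter(frozenset(r) for r in rollouts)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def perplexity(token_logprobs):
    """Perplexity from natural-log token probabilities: exp(-mean log p)."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def should_abstain(rollouts, token_logprobs,
                   max_entropy=0.5, max_perplexity=5.0):
    """Hypothetical inference-time filter: abstain if either signal is high.

    The thresholds are illustrative placeholders, not values from the paper.
    """
    return (retrieval_entropy(rollouts) > max_entropy
            or perplexity(token_logprobs) > max_perplexity)
```

When every rollout selects the same tables, the entropy is zero and only the perplexity term can trigger abstention. When rollouts disagree, the entropy grows toward the logarithm of the number of distinct selections.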