AI · Oct 10, 2025

DualResearch: Entropy-Gated Dual-Graph Retrieval for Answer Reconstruction

Jinxin Shi, Zongsheng Cao, Runmin Ma, Yusong Hu, Jie Zhou, Xin Li, Lei Bai, Liang He, Bo Zhang

arXiv:2510.08959v1 · 5.81 citations · h-index: 3 · Has Code

Originality: Incremental advance

AI Analysis

DualResearch addresses reliability issues in tool-intensive scientific reasoning systems, though it appears to be an incremental advance that complements existing deep-research frameworks.

The paper tackles context pollution, weak evidentiary support, and brittle execution paths in deep-research frameworks for scientific reasoning. It proposes DualResearch, a retrieval and fusion framework that jointly models a breadth semantic graph and a depth causal graph and combines them with entropy-gated fusion. On the HLE and GPQA benchmarks, using InternAgent logs, it improves accuracy by 7.7% and 6.06%, respectively.

Deep-research frameworks orchestrate external tools to perform complex, multi-step scientific reasoning that exceeds the native limits of a single large language model. However, they still suffer from context pollution, weak evidentiary support, and brittle execution paths. To address these issues, we propose DualResearch, a retrieval and fusion framework that matches the epistemic structure of tool-intensive reasoning by jointly modeling two complementary graphs: a breadth semantic graph that encodes stable background knowledge, and a depth causal graph that captures execution provenance. Each graph has a layer-native relevance function: seed-anchored semantic diffusion for breadth, and causal-semantic path matching with reliability weighting for depth. To reconcile their heterogeneity and query-dependent uncertainty, DualResearch converts per-layer path evidence into answer distributions and fuses them in log space via an entropy-gated rule with global calibration. The fusion up-weights the more certain channel and amplifies agreement. As a complement to deep-research systems, DualResearch compresses lengthy multi-tool execution logs into a concise reasoning graph, and we show that it can reconstruct answers stably and effectively. On the scientific reasoning benchmarks HLE and GPQA, DualResearch achieves competitive performance. Using log files from the open-source system InternAgent, its accuracy improves by 7.7% on HLE and 6.06% on GPQA.
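
The fusion step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' formulation: the abstract does not give the exact rule, so the certainty score (one minus normalized entropy), the weight normalization, the `temperature` parameter standing in for "global calibration", and the function name `entropy_gated_fusion` are all assumptions made for illustration. The sketch only shows the general idea of weighting two answer distributions in log space so that the lower-entropy (more certain) channel dominates.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def entropy_gated_fusion(p_breadth, p_depth, temperature=1.0, eps=1e-12):
    """Fuse two answer distributions in log space (hypothetical rule).

    Each channel's weight grows as its entropy shrinks, so the more
    certain channel contributes more to the fused log-probabilities.
    `temperature` is an assumed stand-in for a global calibration step.
    """
    k = len(p_breadth)
    if k != len(p_depth) or k < 2:
        raise ValueError("need two distributions over the same >= 2 answers")

    max_h = math.log(k)  # entropy of the uniform distribution over k answers
    # Certainty in [0, 1]: 0 for a uniform channel, 1 for a one-hot channel.
    c_b = max(0.0, 1.0 - entropy(p_breadth) / max_h)
    c_d = max(0.0, 1.0 - entropy(p_depth) / max_h)

    total = c_b + c_d
    if total <= eps:  # both channels uninformative: weight them equally
        w_b = w_d = 0.5
    else:
        w_b, w_d = c_b / total, c_d / total

    # Weighted sum of log-probabilities, then a numerically stable softmax.
    logits = [
        (w_b * math.log(pb + eps) + w_d * math.log(pd + eps)) / temperature
        for pb, pd in zip(p_breadth, p_depth)
    ]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps], (w_b, w_d)


if __name__ == "__main__":
    # Breadth channel is uninformative; depth channel favors answer 0.
    fused, weights = entropy_gated_fusion(
        [0.25, 0.25, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1]
    )
    print(fused, weights)
```

In this sketch the weights are normalized to sum to one, so the fused result is a weighted geometric mean of the two channels; a rule that also amplifies agreement, as the abstract describes, would need a different weighting that this example does not attempt to reproduce.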
