Nested Browser-Use Learning for Agentic Information Seeking
This work addresses a bottleneck in agentic information seeking by enabling more effective access to rich web content, though its contribution appears to be an incremental improvement over existing methods.
The paper addresses the restriction of information-seeking agents to API-level retrieval by proposing a nested browser-action framework that enables deeper web interaction, with clear practical benefits on challenging benchmarks.
Information-seeking (IS) agents have achieved strong performance across a range of wide and deep search tasks, yet their tool use remains largely restricted to API-level snippet retrieval and URL-based page fetching, which limits access to the richer information available through real browsing. While full browser interaction could unlock deeper capabilities, its fine-grained control and verbose page content introduce substantial complexity for ReAct-style function-calling agents. To bridge this gap, we propose Nested Browser-Use Learning (NestBrowse), which introduces a minimal and complete browser-action framework that decouples interaction control from page exploration through a nested structure. This design simplifies agentic reasoning while enabling effective deep-web information acquisition. Empirical results on challenging deep IS benchmarks demonstrate that NestBrowse offers clear benefits in practice. Further in-depth analyses underscore its efficiency and flexibility.
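The nested structure described above can be illustrated with a toy sketch. This is not the paper's implementation: the names Page, ToyBrowser, explore_page, keyword_relevance, and outer_step are hypothetical, and the keyword filter is a stand-in assumption for whatever model-based page exploration the authors actually use. The point is only to show an outer step that issues a browser action and an inner routine that reads the verbose page in chunks and hands back just the goal-relevant parts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Page:
    """Hypothetical container for a fetched page."""
    url: str
    text: str


class ToyBrowser:
    """Stand-in for a real browser: serves pages from an in-memory dict."""

    def __init__(self, pages: Dict[str, str]):
        self.pages = pages

    def visit(self, url: str) -> Page:
        # Unknown URLs yield an empty page rather than raising.
        return Page(url, self.pages.get(url, ""))


def chunk(text: str, size: int) -> List[str]:
    """Split page text into fixed-size pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def keyword_relevance(chunk_text: str, goal: str) -> bool:
    """Assumed relevance test: any goal word longer than 3 chars appears."""
    words = [w.lower() for w in goal.split() if len(w) > 3]
    lowered = chunk_text.lower()
    return any(w in lowered for w in words)


def explore_page(
    page: Page,
    goal: str,
    is_relevant: Callable[[str, str], bool],
    size: int = 80,
) -> List[str]:
    """Inner loop: scan the page chunk by chunk, keep goal-relevant chunks."""
    return [c for c in chunk(page.text, size) if is_relevant(c, goal)]


def outer_step(browser: ToyBrowser, url: str, goal: str) -> dict:
    """Outer loop step: one browser action, returning a compact observation."""
    page = browser.visit(url)
    evidence = explore_page(page, goal, keyword_relevance)
    # The outer agent sees only the filtered evidence, not the raw page.
    return {"action": "visit", "url": url, "evidence": evidence}


if __name__ == "__main__":
    browser = ToyBrowser(
        {"https://example.test/a": "A" * 80 + "Paris is the capital city of France."}
    )
    print(outer_step(browser, "https://example.test/a", "capital France"))
```

In this sketch the outer step never handles the full page text, which mirrors the stated goal of keeping interaction control separate from page exploration. A real system would replace ToyBrowser with an actual browser driver and keyword_relevance with a learned or prompted extractor.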