IR · CL · Apr 28

Agentic Search in the Wild: Intents and Trajectory Dynamics from 14M+ Real Search Requests

Jingjie Ning, João Coelho, Yibo Kong, Yunfan Long, Bruno Martins, João Magalhães, Jamie Callan, Chenyan Xiong

arXiv:2601.17617 · 83.4 · h-index: 5 · Has Code

Predicted impact: top 14% in IR · last 90 days · Originality: Synthesis-oriented

AI Analysis

For the IR community, this provides the first large-scale empirical characterization of agentic search behavior, offering signals for improving search agents.

This paper analyzes 14.44M search requests from DeepResearchGym to characterize agentic search sessions, finding that over 90% of multi-turn sessions have at most ten steps, 89% of inter-step intervals are under one minute, and 54% of new query terms are traceable to previously retrieved evidence.

LLM-powered search agents are increasingly used for multi-step information-seeking tasks, yet the IR community lacks an empirical understanding of how agentic search sessions unfold and how retrieved evidence is reflected in later queries. This paper presents a large-scale log analysis of agentic search based on 14.44M search requests (3.97M sessions) collected from DeepResearchGym, an open-source search API accessed by external agentic clients. We sessionize the logs, assign session-level intents and step-wise query-reformulation labels using LLM-based annotation, and propose the Context-driven Term Adoption Rate (CTAR) to quantify whether newly introduced query terms are lexically traceable to previously retrieved evidence. Our analyses reveal distinctive behavioral patterns. First, over 90% of multi-turn sessions contain at most ten steps, and 89% of inter-step intervals fall under one minute. Second, behavior varies by intent: fact-seeking sessions exhibit high repetition that increases over time, while sessions requiring reasoning sustain broader exploration. Third, query reformulations are often traceable to retrieved evidence across steps. On average, 54% of newly introduced query terms appear in the accumulated evidence context, with additional traceability to earlier steps beyond the most recent retrieval. These findings provide candidate signals for repetition-aware stopping, intent-adaptive retrieval budgeting, and explicit cross-step context tracking. We have released the anonymized logs in a public HuggingFace repository: https://huggingface.co/datasets/cx-cmu/deepresearchgym-agentic-search-logs
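To make the idea behind CTAR concrete, here is a minimal sketch of a lexical term-adoption rate. It is an illustration, not the authors' implementation: the abstract does not give the exact formula, so the tokenizer, the definition of "newly introduced" (terms absent from the immediately preceding query), and the function names `tokenize` and `ctar` are all assumptions made for this example.

```python
import re

def tokenize(text):
    """Lowercase and split into alphanumeric terms (an assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def ctar(queries, evidence):
    """Per-step share of newly introduced query terms found in prior evidence.

    queries[i]  : query string issued at step i
    evidence[i] : list of document texts retrieved at step i
    Returns one value per step i >= 1, or None when a step adds no new terms.
    """
    rates = []
    context = set()  # terms from all evidence retrieved before the current step
    for i in range(1, len(queries)):
        # Accumulate evidence from the previous step into the running context.
        for doc in evidence[i - 1]:
            context |= tokenize(doc)
        # Assumption: "new" means absent from the immediately preceding query.
        new_terms = tokenize(queries[i]) - tokenize(queries[i - 1])
        if not new_terms:
            rates.append(None)
            continue
        rates.append(len(new_terms & context) / len(new_terms))
    return rates

# Hypothetical three-step session, invented for illustration only.
queries = [
    "solid state battery",
    "solid state battery sulfide electrolyte cost",
    "sulfide electrolyte dendrite toyota",
]
evidence = [
    ["Sulfide electrolyte designs dominate recent solid state battery research."],
    ["Dendrite growth remains a problem for sulfide cells."],
]
rates = ctar(queries, evidence)
print(rates)
```

Because the context accumulates across steps, a term first seen in evidence from step 0 still counts as adopted if it enters the query at step 2. This mirrors the abstract's observation that traceability extends beyond the most recent retrieval.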
