Beyond the Click: A Framework for Inferring Cognitive Traces in Search
For researchers building user simulators for search evaluation, this work provides a method to infer cognitive traces from behavioral logs, though the gains concentrate in datasets where behavioral signals are weak.
The paper introduces a framework to infer cognitive states (e.g., confusion, satisfaction) from search behavior logs using a multi-agent LLM system grounded in Information Foraging Theory. On MovieLens, the cognitive model improves F1 by up to 6.6% over the behavioral baseline and 1.8% above a shuffled-label control, but gains are near zero on AOL, where click patterns are already highly predictive.
User simulators are essential for evaluating search systems, but they primarily reproduce user actions without modeling the underlying thought process. Large-scale interaction logs record what users do, but not what they might be thinking or feeling, such as confusion or satisfaction. We present a framework for inferring cognitive traces from behavioral logs. Our method uses a multi-agent LLM system grounded in Information Foraging Theory (IFT) and validated by human experts. We annotate three public datasets (AOL, Stack Overflow, and MovieLens), producing over 530,000 cognitive labels across 50,000 sessions. A cross-dataset evaluation with a shuffled-label control reveals that cognitive labels provide the strongest signal where behavioral features are weakest: on MovieLens, the cognitive model improves F1 by up to 6.6% over the behavioral baseline and 1.8% above the shuffled control, while on AOL, where click patterns are highly predictive, improvements are near zero. We release the annotation collection on HuggingFace, an open-source annotation tool, and all experimental code to support future work on cognitively aware user simulation.
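The shuffled-label control mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names (`f1_binary`, `shuffled_control`, `fit_lookup`, `compare_conditions`), the majority-vote lookup classifier, and the toy data are all assumptions made for illustration. The idea is to score three conditions on the same train/test split: behavioral features alone, behavioral features plus cognitive labels, and behavioral features plus cognitive labels permuted across sessions. If the real labels beat the permuted ones, the gain is unlikely to come from merely adding an extra feature column.

```python
import random
from collections import Counter, defaultdict


def f1_binary(y_true, y_pred):
    """F1 for the positive class (1) of a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def shuffled_control(labels, seed=0):
    """Permute labels across sessions: the label distribution is kept,
    but the link between each label and its session is broken."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def fit_lookup(features, targets):
    """Hypothetical stand-in for a real classifier: predict the most
    common target seen in training for each distinct feature tuple."""
    votes = defaultdict(Counter)
    for x, y in zip(features, targets):
        votes[x][y] += 1
    default = Counter(targets).most_common(1)[0][0]
    table = {x: c.most_common(1)[0][0] for x, c in votes.items()}
    return lambda x: table.get(x, default)


def compare_conditions(behavior, cognitive, targets, split, seed=0):
    """Return F1 on the held-out part for the three feature sets."""
    conditions = {
        "behavioral": [(b,) for b in behavior],
        "cognitive": list(zip(behavior, cognitive)),
        "shuffled": list(zip(behavior, shuffled_control(cognitive, seed))),
    }
    scores = {}
    for name, feats in conditions.items():
        predict = fit_lookup(feats[:split], targets[:split])
        preds = [predict(x) for x in feats[split:]]
        scores[name] = f1_binary(targets[split:], preds)
    return scores


# Synthetic toy data (not from AOL, Stack Overflow, or MovieLens):
# the behavioral feature is constant, so only the cognitive label helps.
behavior = ["click"] * 20
cognitive = [1, 0] * 10
targets = list(cognitive)
scores = compare_conditions(behavior, cognitive, targets, split=10)
```

In this toy setup the behavioral feature carries no information, which loosely mirrors the weak-behavioral-signal case described for MovieLens. On a dataset where the behavioral features already predict the target well, the three scores would be expected to sit close together, as the abstract reports for AOL.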