Zero-Shot Retrieval with Search Agents and Hybrid Environments
This work aims to improve the retrieval performance of AI agents that search for information. The contribution is incremental: it builds on existing learning-to-search methods.
The paper tackles the problem of building artificial agents that learn to search for information autonomously. It extends previous setups to a hybrid environment that applies discrete query refinement operations after an initial dual encoder retrieval step. On BEIR, search agents trained via behavioral cloning outperform the underlying search system, a dual encoder retriever combined with a cross encoder reranker. The resulting agent (HARE) matches state-of-the-art performance, balanced across zero-shot and in-domain evaluations, at twice the speed.
Learning to search is the task of building artificial agents that learn to autonomously use a search box to find information. So far, it has been shown that current language models can learn symbolic query reformulation policies in combination with traditional term-based retrieval, but fall short of outperforming neural retrievers. We extend the previous learning-to-search setup to a hybrid environment, which accepts discrete query refinement operations after a first-pass retrieval step via a dual encoder. Experiments on the BEIR benchmark show that search agents, trained via behavioral cloning, outperform the underlying search system based on a combined dual encoder retriever and cross encoder reranker. Furthermore, we find that simple heuristic Hybrid Retrieval Environments (HRE) can improve baseline performance by several nDCG points. The search agent based on HRE (HARE) matches state-of-the-art performance, balanced in both zero-shot and in-domain evaluations, via interpretable actions, and at twice the speed.
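To make the pipeline in the abstract concrete, the following is a minimal sketch of one hybrid retrieval step: a dense first pass, a single discrete refinement operation served by term-based retrieval, a merge, and a rerank, with nDCG@k measured before and after. It is not the authors' implementation. The four-document collection, the relevance judgments, and every function name are hypothetical; a dot product over bag-of-words counts stands in for the dual encoder, cosine similarity stands in for the cross encoder reranker, and the operator semantics ("+" requires a term, "-" excludes it) are assumed for illustration.

```python
import math
from collections import Counter

# Hypothetical four-document collection, invented for illustration.
DOCS = {
    "d1": "jaguar car engine speed review",
    "d2": "jaguar cat habitat rainforest",
    "d3": "car engine repair manual",
    "d4": "rainforest cat species",
}

def bow(text):
    """Bag-of-words counts; a toy stand-in for a learned encoder."""
    return Counter(text.lower().split())

def dot(a, b):
    return sum(count * b[term] for term, count in a.items())

def cosine(a, b):
    norm = math.sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / norm if norm else 0.0

def rank(doc_ids, score, k):
    """Sort by descending score, breaking ties by document id."""
    return sorted(doc_ids, key=lambda d: (-score(d), d))[:k]

def first_pass(query, k=2):
    """Dense first-pass retrieval (dot product stands in for a dual encoder)."""
    q = bow(query)
    return rank(DOCS, lambda d: dot(q, bow(DOCS[d])), k)

def term_search(query, op, term):
    """Term-based retrieval with one refinement operator.
    Assumed semantics: '+' requires the term, '-' excludes it."""
    q_terms = set(bow(query))
    hits = []
    for doc_id, text in DOCS.items():
        tokens = set(bow(text))
        if not (q_terms & tokens):
            continue  # no overlap with the query at all
        if (op == "+" and term in tokens) or (op == "-" and term not in tokens):
            hits.append(doc_id)
    return hits

def refine(query, results, op, term, k=2):
    """Merge term-search hits into the current results, then rerank
    (cosine stands in for a cross encoder reranker)."""
    pool = set(results) | set(term_search(query, op, term))
    q = bow(query)
    return rank(pool, lambda d: cosine(q, bow(DOCS[d])), k)

def ndcg_at_k(ranking, relevance, k):
    """nDCG@k with linear gain and a log2 rank discount."""
    dcg = sum(relevance.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranking[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0

query = "jaguar cat"
relevance = {"d2": 1, "d4": 1}  # hypothetical relevance judgments
before = first_pass(query)
after = refine(query, before, "+", "rainforest")
print(before, round(ndcg_at_k(before, relevance, 2), 3))
print(after, round(ndcg_at_k(after, relevance, 2), 3))
```

In this toy run the first pass returns d2 and d1, for an nDCG@2 of about 0.613. The "+rainforest" refinement brings d4 into the candidate pool, the reranker places it above d1, and nDCG@2 rises to 1.0. The refinement step is a readable action on the query, which is the sense in which such an agent's behavior is interpretable; the numbers themselves say nothing about the paper's results.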