SearchGym: A Modular Infrastructure for Cross-Platform Benchmarking and Hybrid Search Orchestration
SearchGym targets researchers and engineers in information retrieval who need to build robust systems, though the contribution appears incremental: it builds on existing RAG toolkits, with a focus on modularity and benchmarking.
The paper tackles the gap between experimental prototypes and production-ready systems in Retrieval-Augmented Generation (RAG) by introducing SearchGym, a modular infrastructure for cross-platform benchmarking and hybrid search orchestration. On the LitSearch benchmark, SearchGym achieves a 70% Top-100 retrieval rate.
The rapid growth of Retrieval-Augmented Generation (RAG) has created a proliferation of toolkits, yet a fundamental gap remains between experimental prototypes and robust, production-ready systems. We present SearchGym, a modular infrastructure designed for cross-platform benchmarking and hybrid search orchestration. Unlike existing model-centric frameworks, SearchGym decouples data representation, embedding strategies, and retrieval logic into stateful abstractions: Dataset, VectorSet, and App. This separation enables a Compositional Config Algebra, allowing designers to synthesize entire systems from hierarchical configurations while ensuring perfect reproducibility. Moreover, we analyze "Top-$k$ Cognizance" in hybrid retrieval pipelines, demonstrating that the optimal ordering of semantic ranking and structured filtering depends strongly on filter strength. Evaluated on the expert-annotated LitSearch benchmark, SearchGym achieves a 70% Top-100 retrieval rate. SearchGym also reveals a design tension between generalizability and optimizability, suggesting that engineering optimization may serve as a tool for uncovering the causal mechanisms inherent in information retrieval across heterogeneous domains. An open-source implementation of SearchGym is available at: https://github.com/JeromeTH/search-gym
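The ordering question raised in the abstract can be illustrated with a toy example. The sketch below is not SearchGym's API: the function names, the document fields (`id`, `vec`, `year`), and the dot-product scoring are all assumptions made for illustration. It shows one way the order of semantic ranking and structured filtering can change the result when the filter is highly selective: truncating to the top $k$ before filtering may discard every document that would have passed the filter.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]  # hypothetical record: {"id": str, "vec": List[float], "year": int}


def dot(a: List[float], b: List[float]) -> float:
    """Dot product, used here as a stand-in similarity score."""
    return sum(x * y for x, y in zip(a, b))


def rank_then_filter(docs: List[Doc], query_vec: List[float],
                     keep: Callable[[Doc], bool], k: int) -> List[str]:
    """Take the top-k by similarity first, then apply the structured filter."""
    ranked = sorted(docs, key=lambda d: dot(d["vec"], query_vec), reverse=True)[:k]
    return [d["id"] for d in ranked if keep(d)]


def filter_then_rank(docs: List[Doc], query_vec: List[float],
                     keep: Callable[[Doc], bool], k: int) -> List[str]:
    """Apply the structured filter first, then take the top-k by similarity."""
    candidates = [d for d in docs if keep(d)]
    ranked = sorted(candidates, key=lambda d: dot(d["vec"], query_vec), reverse=True)
    return [d["id"] for d in ranked[:k]]


# Toy corpus (made-up values): the most similar documents fail the filter.
docs = [
    {"id": "a", "vec": [0.9, 0.1], "year": 2015},
    {"id": "b", "vec": [0.8, 0.2], "year": 2016},
    {"id": "c", "vec": [0.7, 0.3], "year": 2023},
    {"id": "d", "vec": [0.1, 0.9], "year": 2024},
]
query = [1.0, 0.0]


def is_recent(d: Doc) -> bool:
    """Hypothetical structured filter: keep documents from 2020 onward."""
    return d["year"] >= 2020


pre = filter_then_rank(docs, query, is_recent, k=2)   # filter sees all documents
post = rank_then_filter(docs, query, is_recent, k=2)  # filter sees only the top 2
```

In this toy corpus, `filter_then_rank` returns `["c", "d"]`, while `rank_then_filter` returns an empty list because both of the top-2 documents by similarity predate 2020. With a weak filter that most documents pass, the two orderings would tend to agree, and ranking first avoids scanning the whole corpus with the filter.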