CLMay 30

Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents

Yibo Wang, Nikki Lijing Kuang, Philip S. Yu, Zhewei Yao, Yuxiong He

arXiv:2606.0054753.8h-index: 4

AI Analysis

For developers of interactive text-to-SQL agents, this work addresses the limitation of single-horizon memory retrieval by proposing a multi-horizon approach that improves experience reuse and task success.

MERIT introduces a dual-level memory retrieval framework for interactive text-to-SQL agents, using reinforcement learning to optimize both episode-level and turn-level memory selection. On BIRD-Interact, it outperforms baselines in success rate while reducing interaction turns, and shows positive cross-benchmark transfer on Spider2-Snow.

Interactive text-to-SQL agents solve database tasks through multi-turn interactions involving schema exploration, query execution, feedback interpretation, and decision revision. Long-term memory helps agents reuse past experiences, but existing retrieval methods remain limited. Static methods rely on fixed similarity heuristics that do not optimize downstream utility, while dynamic methods often learn from sparse final outcomes and retrieve memories at a single decision horizon. This is insufficient when memory usefulness changes across interaction stages, since memories useful for initial planning may differ from those needed for local, state-conditioned execution. We propose MERIT, a dynamic multi-horizon memory retrieval framework. MERIT maintains episode-level memory for global strategic guidance and turn-level memory for local decision support. Both levels use learned retrieval policies optimized with reinforcement learning. To train turn-level retrieval despite limited intermediate supervision, MERIT uses a lightweight Process Reward Model to provide dense proxy rewards for local memory selection. Experiments on BIRD-Interact show that MERIT outperforms no-memory, static-retrieval, and dynamic-retrieval baselines in success rate while reducing average interaction turns. Transfer results on Spider2-Snow further show positive cross-benchmark transfer without benchmark-specific tuning. These results suggest that multi-horizon retrieval improves experience reuse in interactive text-to-SQL agents.

View on arXiv PDF

Similar