Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation
This work addresses the need for scalable, adaptable AI assistants in dynamic environments where user intent and available tools evolve. It offers a novel method for a known bottleneck rather than a foundational breakthrough.
The paper argues that static, single-turn retrieval-augmented generation (RAG) systems are unsuitable for dynamic domains such as healthcare and smart homes. It presents Dynamic Context Tuning (DCT) to enhance multi-turn planning and tool adaptation, reporting a 14% improvement in plan accuracy and a 37% reduction in hallucinations while matching GPT-4 performance at lower cost.
Retrieval-Augmented Generation (RAG) has significantly advanced large language models (LLMs) by grounding their outputs in external tools and knowledge sources. However, existing RAG systems are typically constrained to static, single-turn interactions with fixed toolsets, making them ill-suited for dynamic domains such as healthcare and smart homes, where user intent, available tools, and contextual factors evolve over time. We present Dynamic Context Tuning (DCT), a lightweight framework that extends RAG to support multi-turn dialogue and evolving tool environments without requiring retraining. DCT integrates an attention-based context cache to track relevant past information, LoRA-based retrieval to dynamically select domain-specific tools, and efficient context compression to maintain inputs within LLM context limits. Experiments on both synthetic and real-world benchmarks show that DCT improves plan accuracy by 14% and reduces hallucinations by 37%, while matching GPT-4 performance at significantly lower cost. Furthermore, DCT generalizes to previously unseen tools, enabling scalable and adaptable AI assistants across a wide range of dynamic environments.
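The abstract does not specify how the context cache or the compression step is implemented, so the following is only a minimal sketch of one way the idea could work, not the authors' method. It assumes each past turn carries a toy embedding vector, scores turns against the current query with softmax-normalized dot products (a simple stand-in for attention), and greedily keeps the highest-weighted turns that fit a token budget. The names `Turn`, `ContextCache`, `select`, and `token_budget` are hypothetical, whitespace word counts stand in for real token counts, and LoRA-based tool retrieval is not covered.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    text: str
    embedding: List[float]  # toy vector; a real system would use an encoder


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


@dataclass
class ContextCache:
    """Hypothetical cache of past dialogue turns (illustrative only)."""
    turns: List[Turn] = field(default_factory=list)

    def add(self, text: str, embedding: List[float]) -> None:
        self.turns.append(Turn(text, embedding))

    def attention_weights(self, query: List[float]) -> List[float]:
        # Softmax over query-turn dot products; empty cache yields no weights.
        if not self.turns:
            return []
        return softmax([dot(query, t.embedding) for t in self.turns])

    def select(self, query: List[float], token_budget: int) -> List[str]:
        # Greedily keep the highest-weighted turns that fit the budget,
        # then return them in their original dialogue order.
        weights = self.attention_weights(query)
        ranked = sorted(range(len(self.turns)),
                        key=lambda i: weights[i], reverse=True)
        chosen, used = [], 0
        for i in ranked:
            cost = len(self.turns[i].text.split())  # crude token proxy
            if used + cost <= token_budget:
                chosen.append(i)
                used += cost
        return [self.turns[i].text for i in sorted(chosen)]


cache = ContextCache()
cache.add("User has a penicillin allergy", [1.0, 0.0])
cache.add("User asked about the weather", [0.0, 1.0])
cache.add("User wants an antibiotic refill", [0.9, 0.1])

# With a 10-word budget, only the two turns most aligned with the query fit.
selected = cache.select(query=[1.0, 0.0], token_budget=10)
print(selected)
```

In this toy run the weather turn is dropped because it has the lowest weight and would exceed the budget, while the two medically relevant turns are retained in dialogue order. A production system would likely replace the greedy drop with summarization or learned compression, which this sketch does not attempt.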