Recursive Language Models
This work addresses the context window limitation of LLMs in applications that require long-text processing. It represents a novel inference-time scaling approach rather than an incremental improvement.
The paper tackles the problem of processing arbitrarily long prompts with large language models by introducing Recursive Language Models (RLMs), an inference paradigm in which the model recursively examines and decomposes its prompt. RLMs handle inputs up to two orders of magnitude beyond model context windows and outperform vanilla frontier LLMs and common long-context scaffolds across four tasks at comparable cost.
We study allowing large language models (LLMs) to process arbitrarily long prompts through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference paradigm that treats long prompts as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the prompt. We find that RLMs can successfully process inputs up to two orders of magnitude beyond model context windows and, even for shorter prompts, dramatically outperform the quality of vanilla frontier LLMs and common long-context scaffolds across four diverse long-context tasks while having comparable cost. At a small scale, we post-train the first natively recursive language model. Our model, RLM-Qwen3-8B, outperforms the underlying Qwen3-8B model by $28.3\%$ on average and even approaches the quality of vanilla GPT-5 on three long-context tasks. Code is available at https://github.com/alexzhang13/rlm.
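The recursive idea in the abstract (examine the prompt, decompose it, and call the model again over snippets) can be illustrated with a minimal sketch. This is not the implementation from the paper or its repository. It assumes a fixed, character-based chunking strategy, whereas the abstract describes the LLM itself deciding programmatically how to examine and split the prompt. The names `recursive_answer`, `toy_llm`, `window`, and `max_depth` are hypothetical, and `toy_llm` is a keyword-matching stand-in for a real model call.

```python
from typing import Callable

# Hypothetical model interface: (query, context) -> answer.
# This is an assumption for illustration, not an API from the paper's code.
LLM = Callable[[str, str], str]


def recursive_answer(query: str, prompt: str, llm: LLM,
                     window: int = 2000, depth: int = 0,
                     max_depth: int = 8) -> str:
    """Answer `query` over `prompt`, recursing when `prompt` exceeds `window` characters."""
    # Base case: the prompt fits in the assumed window, or the depth limit is reached
    # (in which case the prompt is truncated so the call still fits).
    if len(prompt) <= window or depth >= max_depth:
        return llm(query, prompt[:window])
    # Decompose: split the long prompt into window-sized snippets.
    snippets = [prompt[i:i + window] for i in range(0, len(prompt), window)]
    # Recurse: one sub-call per snippet.
    partials = [recursive_answer(query, s, llm, window, depth + 1, max_depth)
                for s in snippets]
    # Combine: treat the non-empty partial answers as a new, shorter prompt.
    combined = "\n".join(p for p in partials if p)
    return recursive_answer(query, combined, llm, window, depth + 1, max_depth)


def toy_llm(query: str, context: str) -> str:
    # Stand-in for a real model: return the context lines that mention the query term.
    return "\n".join(line for line in context.splitlines() if query in line)


# A prompt far longer than the assumed 2000-character window.
long_prompt = "filler\n" * 5000 + "the secret code is 7421\n" + "filler\n" * 5000
answer = recursive_answer("secret", long_prompt, toy_llm)
print(answer)
```

In this sketch no single call to `toy_llm` ever sees more than `window` characters, even though the full prompt is roughly 70,000 characters long. Fixed-size chunking can split a relevant passage across two snippets, which is one reason a model-driven decomposition, as the abstract describes, is more flexible than this simplified version.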