ATLAS-RTC: Closing the Loop on LLM Agent Output with Token-Level Runtime Control
For developers of LLM-based systems that require reliable structured output, ATLAS-RTC offers a practical runtime control layer that intervenes during decoding, correcting drift before it materializes as an error in the output.
ATLAS-RTC introduces a runtime control system for LLMs that enforces structured output during decoding, improving first-attempt success rates by 20–37.8 percentage points and reducing latency by up to 88% in failure-dominated settings.
We present ATLAS-RTC, a runtime control system for autoregressive language models that enforces structured output during decoding. ATLAS-RTC monitors generation at each step, detects drift from output contracts using lightweight signals, and applies targeted interventions such as biasing, masking, and rollback. Unlike post-hoc validation or static constrained decoding, it operates in a closed loop, enabling correction before errors materialize. Across structured generation and tool-calling tasks, ATLAS-RTC improves first-attempt success rates by 20 to 37.8 percentage points, with up to 88% latency reduction in failure-dominated settings. Results show that many failures arise from decoding artifacts rather than task misunderstanding, motivating runtime control as a distinct layer in LLM systems.
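To make the closed-loop idea concrete, the following is a minimal sketch of a monitor, detect, intervene cycle over a toy greedy decoder. It is not the ATLAS-RTC implementation. Everything in it is an assumption made for illustration: the vocabulary, the per-position contract `CONTRACT`, the stand-in model `toy_logits`, the drift signal (softmax mass on tokens the contract forbids), the thresholds `low` and `high`, and the escalation order of bias, mask, then rollback.

```python
import math

# Hypothetical toy setup: a 7-token vocabulary and a contract that lists
# the tokens permitted at each output position.
VOCAB = ["{", '"a"', ":", "1", "}", "oops", "<eos>"]
CONTRACT = [{"{"}, {'"a"'}, {":"}, {"1"}, {"}"}, {"<eos>"}]

def toy_logits(prefix):
    """Stand-in for a model: follows the contract but drifts at some steps."""
    step = len(prefix)
    logits = {tok: -2.0 for tok in VOCAB}
    for tok in CONTRACT[step]:
        logits[tok] = 2.0
    # Injected decoding artifacts of increasing severity (illustrative values).
    artifact = {1: 2.5, 3: 6.0, 4: 3.5}
    if step in artifact:
        logits["oops"] = artifact[step]
    return logits

def drift(logits, allowed):
    """Lightweight signal: softmax probability mass on forbidden tokens."""
    total = sum(math.exp(v) for v in logits.values())
    bad = sum(math.exp(v) for t, v in logits.items() if t not in allowed)
    return bad / total

def controlled_decode(low=0.3, high=0.9, bias=1.0):
    """Greedy decoding with per-step monitoring and targeted interventions."""
    prefix, log = [], []
    force_mask = False  # set after a rollback so the retry cannot fail again
    while len(prefix) < len(CONTRACT):
        step = len(prefix)
        allowed = CONTRACT[step]
        logits = toy_logits(prefix)
        d = drift(logits, allowed)
        if force_mask or d >= high:
            # Masking: forbid every token outside the contract.
            logits = {t: (v if t in allowed else -math.inf)
                      for t, v in logits.items()}
            action = "mask"
        elif d >= low:
            # Biasing: nudge allowed tokens upward without forbidding others.
            logits = {t: (v + bias if t in allowed else v)
                      for t, v in logits.items()}
            action = "bias"
        else:
            action = "none"
        force_mask = False
        prefix.append(max(logits, key=logits.get))  # greedy pick
        if prefix[-1] not in allowed:
            # Rollback: undo the violating token and retry this step masked.
            prefix.pop()
            action, force_mask = "rollback", True
        log.append((step, action))
    return prefix, log

if __name__ == "__main__":
    tokens, log = controlled_decode()
    print(" ".join(tokens))
    print(log)
```

In this toy run the controller leaves low-drift steps alone, biases at step 1, masks at step 3, and at step 4 the bias proves too weak, so the violating token is rolled back and the step is retried with masking. The sketch only illustrates why correction inside the decoding loop can avoid a failed first attempt; real drift signals and intervention policies would need to be far more careful than these hand-picked thresholds.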