CV · CL · RO · Dec 13, 2025

From Human Intention to Action Prediction: Intention-Driven End-to-End Autonomous Driving

arXiv:2512.12302v2
Originality: Incremental advance
AI Analysis

This work addresses the need for autonomous agents that can understand abstract human goals. It is an incremental step, contributing a new benchmark and a new evaluation protocol.

The paper addresses the restriction of autonomous driving systems to simple navigational commands by introducing a benchmark for interpreting high-level human intentions. Its experiments show that existing models struggle with intention fulfillment, while the two proposed frameworks align more closely with human intentions.

While end-to-end autonomous driving has achieved remarkable progress in geometric control, current systems remain constrained by a command-following paradigm that relies on simple navigational instructions. Transitioning to genuinely intelligent agents requires the capability to interpret and fulfill high-level, abstract human intentions. However, this advancement is hindered by the lack of dedicated benchmarks and semantic-aware evaluation metrics. In this paper, we formally define the task of Intention-Driven End-to-End Autonomous Driving and present Intention-Drive, a comprehensive benchmark designed to bridge this gap. We construct a large-scale dataset featuring complex natural language intentions paired with high-fidelity sensor data. To overcome the limitations of conventional trajectory-based metrics, we introduce the Imagined Future Alignment (IFA), a novel evaluation protocol leveraging generative world models to assess the semantic fulfillment of human goals beyond mere geometric accuracy. Furthermore, we explore the solution space by proposing two distinct paradigms: an end-to-end vision-language planner and a hierarchical agent-based framework. The experiments reveal a critical dichotomy where existing models exhibit satisfactory driving stability but struggle significantly with intention fulfillment. Notably, the proposed frameworks demonstrate superior alignment with human intentions.

