ROMar 11

KnowDiffuser: A Knowledge-Guided Diffusion Planner with LM Reasoning and Prior-Informed Trajectory Initialization

Fan Ding, Xuewen Luo, Fengze Yang, Bo Yu, HwaHui Tew, Ganesh Krishnasamy, Junn Yong Loo

arXiv:2603.10441v110.2h-index: 9

Predicted impact top 39% in RO · last 90 daysOriginality Highly original

AI Analysis

This work addresses the semantic-to-physical gap in autonomous driving systems, providing a robust and interpretable framework for motion planning.

The paper tackled the problem of generating continuous, physically feasible trajectories for autonomous driving by integrating language models' semantic reasoning with diffusion models' generative power, resulting in KnowDiffuser, which significantly outperformed existing planners on the nuPlan benchmark in open-loop and closed-loop evaluations.

Recent advancements in Language Models (LMs) have demonstrated strong semantic reasoning capabilities, enabling their application in high-level decision-making for autonomous driving (AD). However, LMs operate over discrete token spaces and lack the ability to generate continuous, physically feasible trajectories required for motion planning. Meanwhile, diffusion models have proven effective at generating reliable and dynamically consistent trajectories, but often lack semantic interpretability and alignment with scene-level understanding. To address these limitations, we propose \textbf{KnowDiffuser}, a knowledge-guided motion planning framework that tightly integrates the semantic understanding of language models with the generative power of diffusion models. The framework employs a language model to infer context-aware meta-actions from structured scene representations, which are then mapped to prior trajectories that anchor the subsequent denoising process. A two-stage truncated denoising mechanism refines these trajectories efficiently, preserving both semantic alignment and physical feasibility. Experiments on the nuPlan benchmark demonstrate that KnowDiffuser significantly outperforms existing planners in both open-loop and closed-loop evaluations, establishing a robust and interpretable framework that effectively bridges the semantic-to-physical gap in AD systems.

View on arXiv PDF

Similar