RO CV LG · Mar 13, 2025

Learning Personalized Driving Styles via Reinforcement Learning from Human Feedback

Derun Li, Changye Li, Yue Wang, Jianwei Ren, Xin Wen, Pengxiang Li, Leimeng Xu, Kun Zhan, Peng Jia, Xianpeng Lang, Ningyi Xu, Hang Zhao

arXiv:2503.10434v222.813 citations · h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses the need for adaptive, human-like trajectory generation in autonomous driving, though it appears incremental because it builds on existing generative models and adds human feedback integration.

The paper tackles the problem of generating personalized driving styles in autonomous vehicles by introducing TrajHF, a human feedback-driven finetuning framework that refines generative trajectory models using reinforcement learning, achieving performance comparable to state-of-the-art on the NavSim benchmark.

Generating human-like and adaptive trajectories is essential for autonomous driving in dynamic environments. While generative models have shown promise in synthesizing feasible trajectories, they often fail to capture the nuanced variability of personalized driving styles due to dataset biases and distributional shifts. To address this, we introduce TrajHF, a human feedback-driven finetuning framework for generative trajectory models, designed to align motion planning with diverse driving styles. TrajHF incorporates a multi-conditional denoiser and reinforcement learning with human feedback to refine multi-modal trajectory generation beyond conventional imitation learning. This enables better alignment with human driving preferences while maintaining safety and feasibility constraints. TrajHF achieves performance comparable to the state-of-the-art on the NavSim benchmark. TrajHF sets a new paradigm for personalized and adaptable trajectory generation in autonomous driving.
