AI · Nov 28, 2025

Evolutionary Discovery of Heuristic Policies for Traffic Signal Control

Ruibing Wang, Shuhan Guo, Zeen Li, Zhen Wang, Quanming Yao

arXiv:2511.23122v1 · 5.81 citations

Originality: Incremental advance

AI Analysis

This work addresses traffic signal control for urban traffic management. It combines LLM reasoning with evolutionary optimization, though integrating LLMs into evolutionary frameworks appears to be an incremental step rather than a new direction.

The paper tackles traffic signal control by using LLMs as an evolution engine to derive specialized heuristic policies. The resulting policies are lightweight and robust, and are reported to outperform existing heuristics and online LLM actors.

Traffic Signal Control (TSC) involves a challenging trade-off: classic heuristics are efficient but oversimplified, while Deep Reinforcement Learning (DRL) achieves high performance yet suffers from poor generalization and opaque policies. Online Large Language Models (LLMs) provide general reasoning but incur high latency and lack environment-specific optimization. To address these issues, we propose Temporal Policy Evolution for Traffic, which uses LLMs as an evolution engine to derive specialized heuristic policies. The framework introduces two key modules: (1) Structured State Abstraction (SSA), converting high-dimensional traffic data into temporal-logical facts for reasoning; and (2) Credit Assignment Feedback (CAF), tracing flawed micro-decisions to poor macro-outcomes for targeted critique. Operating entirely at the prompt level without training, the framework yields lightweight, robust policies optimized for specific traffic environments, outperforming both heuristics and online LLM actors.
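To make the two ideas in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`abstract_state`, `evolve`), the fact wording, the queue threshold, and the "lower score is better" convention are all assumptions chosen for illustration. `abstract_state` shows one way raw queue histories might be turned into short temporal facts, and `evolve` shows a generic propose, evaluate, keep-the-best loop in which the `propose` and `critique` callables stand in for LLM calls.

```python
from typing import Callable, Dict, List

# Hypothetical type: a policy maps current queue lengths to the approach to serve.
Policy = Callable[[Dict[str, int]], str]


def abstract_state(history: Dict[str, List[int]], long_queue: int = 8) -> List[str]:
    """Turn per-approach queue histories into short temporal facts (illustrative only)."""
    facts = []
    for approach, queues in sorted(history.items()):
        first, last = queues[0], queues[-1]
        if last > first:
            trend = "rising"
        elif last < first:
            trend = "falling"
        else:
            trend = "steady"
        facts.append(f"queue({approach}) is {trend} from {first} to {last}")
        # Assumed threshold; a real system would tune or learn this.
        if last >= long_queue:
            facts.append(f"queue({approach}) exceeds threshold {long_queue}")
    return facts


def evolve(
    seed: Policy,
    propose: Callable[[Policy, str], Policy],
    evaluate: Callable[[Policy], float],
    critique: Callable[[Policy], str],
    generations: int = 3,
) -> Policy:
    """Keep the best policy seen so far; lower evaluate() is better (e.g. mean delay)."""
    best, best_score = seed, evaluate(seed)
    for _ in range(generations):
        # critique() stands in for feedback that traces poor outcomes to decisions;
        # propose() stands in for an LLM call that rewrites the policy using it.
        candidate = propose(best, critique(best))
        score = evaluate(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    for fact in abstract_state({"north": [4, 6, 9], "east": [3, 3, 3]}):
        print(fact)
```

Because the search happens offline, the policy that comes out of such a loop is ordinary code and can run at the intersection without calling a model per decision, which is the latency argument the abstract makes against online LLM actors.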
