AI · Mar 21, 2023
Large-Scale Traffic Signal Control Using Constrained Network Partition and Adaptive Deep Reinforcement Learning
Hankang Gu, Shangbo Wang, Xiaoguang Ma et al.
Multi-agent Deep Reinforcement Learning (MADRL) based traffic signal control has become a popular research topic in recent years. To alleviate the scalability issue of completely centralized RL techniques and the non-stationarity issue of completely decentralized RL techniques on large-scale traffic networks, some literature utilizes a regional control approach where the whole network is first partitioned into multiple disjoint regions, followed by applying the centralized RL approach to each region. However, the existing partitioning rules either have no constraints on the topology of regions or require the same topology for all regions. Meanwhile, no existing regional control approach explores the performance of optimal joint action in an exponentially growing regional action space when intersections are controlled by 4-phase traffic signals (EW, EWL, NS, NSL). In this paper, we propose a novel RL training framework named RegionLight to tackle the above limitations. Specifically, the topology of regions is first constrained to a star network which comprises one center and an arbitrary number of leaves. Next, the network partitioning problem is modeled as an optimization problem to minimize the number of regions. Then, an Adaptive Branching Dueling Q-Network (ABDQ) model is proposed to decompose the regional control task into several joint signal control sub-tasks corresponding to particular intersections. Subsequently, these sub-tasks maximize the regional benefits cooperatively. Finally, the global control strategy for the whole network is obtained by concatenating the optimal joint actions of all regions. Experimental results demonstrate the superiority of our proposed framework over all baselines under both real and synthetic datasets in all evaluation metrics.
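As a rough illustration of the star-shaped partitioning idea described in this abstract, the sketch below greedily splits a grid of intersections into disjoint star regions (one center plus any of its still-unassigned neighbours). This is a simple greedy heuristic chosen for illustration, not the optimization formulation the paper uses, and it does not guarantee the minimum number of regions. The function names `grid_graph`, `greedy_star_partition`, and `joint_action_space_size` are hypothetical. The last helper shows why the regional joint action space grows exponentially when every intersection has 4 phases.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]


def grid_graph(rows: int, cols: int) -> Dict[Node, Set[Node]]:
    """Build the adjacency map of a rows x cols grid of intersections."""
    adj: Dict[Node, Set[Node]] = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = set()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.add((rr, cc))
            adj[(r, c)] = nbrs
    return adj


def greedy_star_partition(adj: Dict[Node, Set[Node]]) -> List[Tuple[Node, List[Node]]]:
    """Greedy heuristic (an assumption, not the paper's method):
    repeatedly pick the unassigned node with the most unassigned
    neighbours as a center and take those neighbours as its leaves."""
    unassigned = set(adj)
    regions: List[Tuple[Node, List[Node]]] = []
    while unassigned:
        # Sorting first makes tie-breaking deterministic.
        center = max(sorted(unassigned), key=lambda v: len(adj[v] & unassigned))
        leaves = sorted(adj[center] & unassigned)
        regions.append((center, leaves))
        unassigned -= {center, *leaves}
    return regions


def joint_action_space_size(num_intersections: int, phases: int = 4) -> int:
    """Number of joint actions when each intersection picks one of `phases` phases."""
    return phases ** num_intersections


if __name__ == "__main__":
    regions = greedy_star_partition(grid_graph(4, 4))
    for center, leaves in regions:
        size = joint_action_space_size(1 + len(leaves))
        print(f"center={center} leaves={leaves} joint actions={size}")
```

A star region with one center and four leaves already has 4^5 = 1024 joint actions, which is the kind of growth that motivates decomposing the regional task into per-intersection branches.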
CV · Dec 28, 2025
Wavelet-based Multi-View Fusion of 4D Radar Tensor and Camera for Robust 3D Object Detection
Runwei Guan, Jianan Liu, Shaofeng Liang et al.
4D millimeter-wave (mmWave) radar has been widely adopted in autonomous driving and robot perception due to its low cost and all-weather robustness. However, point-cloud-based radar representations suffer from information loss due to multi-stage signal processing, while directly utilizing raw 4D radar tensors incurs prohibitive computational costs. To address these challenges, we propose WRCFormer, a novel 3D object detection framework that efficiently fuses raw 4D radar cubes with camera images via decoupled multi-view radar representations. Our approach introduces two key components: (1) A Wavelet Attention Module embedded in a wavelet-based Feature Pyramid Network (FPN), which enhances the representation of sparse radar signals and image data by capturing joint spatial-frequency features, thereby mitigating information loss while maintaining computational efficiency. (2) A Geometry-guided Progressive Fusion mechanism, a two-stage query-based fusion strategy that progressively aligns multi-view radar and visual features through geometric priors, enabling modality-agnostic and efficient integration without overwhelming computational overhead. Extensive experiments on the K-Radar benchmark show that WRCFormer achieves state-of-the-art performance, surpassing the best existing model by approximately 2.4% in all scenarios and 1.6% in sleet conditions, demonstrating strong robustness in adverse weather.
CV · Mar 11, 2025
Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving
Runwei Guan, Jianan Liu, Ningwei Ouyang et al.
Embodied outdoor scene understanding forms the foundation for autonomous agents to perceive, analyze, and react to dynamic driving environments. However, existing 3D understanding is predominantly based on 2D Vision-Language Models (VLMs), which collect and process limited scene-aware contexts. In contrast to 2D planar visual information, point cloud sensors such as LiDAR provide rich depth and fine-grained 3D representations of objects. Even better, the emerging 4D millimeter-wave radar detects the motion trend, velocity, and reflection intensity of each object. The integration of these two modalities provides more flexible querying conditions for natural language, thereby supporting more accurate 3D visual grounding. To this end, we propose a novel method called TPCNet, the first outdoor 3D visual grounding model upon the paradigm of prompt-guided point cloud sensor combination, including both LiDAR and radar sensors. To optimally combine the features of these two sensors required by the prompt, we design a multi-fusion paradigm called Two-Stage Heterogeneous Modal Adaptive Fusion. Specifically, this paradigm initially employs Bidirectional Agent Cross-Attention (BACA), which feeds both-sensor features, characterized by global receptive fields, to the text features for querying. Moreover, we design a Dynamic Gated Graph Fusion (DGGF) module to locate the regions of interest identified by the queries. To further enhance accuracy, we devise a C3D-RECHead, based on the nearest object edge to the ego-vehicle. Experimental results demonstrate that our TPCNet, along with its individual modules, achieves state-of-the-art performance on both the Talk2Radar and Talk2Car datasets. We release the code at https://github.com/GuanRunwei/TPCNet.
MA · Feb 18, 2025
Communication Strategy on Macro-and-Micro Traffic State in Cooperative Deep Reinforcement Learning for Regional Traffic Signal Control
Hankang Gu, Shangbo Wang, Dongyao Jia et al.
Adaptive Traffic Signal Control (ATSC) has become a popular research topic in intelligent transportation systems. Regional Traffic Signal Control (RTSC) using the Multi-agent Deep Reinforcement Learning (MADRL) technique has become a promising approach for ATSC due to its ability to achieve the optimum trade-off between scalability and optimality. Most existing RTSC approaches partition a traffic network into several disjoint regions, followed by applying centralized reinforcement learning techniques to each region. However, the pursuit of cooperation among RTSC agents remains an open issue, and no communication strategy for RTSC agents has been investigated. In this paper, we propose communication strategies to capture the correlation of micro-traffic states among lanes and the correlation of macro-traffic states among intersections. We first justify that the evolution equation of the RTSC process is Markovian via a system of store-and-forward queues. Next, based on the evolution equation, we propose two GAT-Aggregated (GA2) communication modules, GA2-Naive and GA2-Aug, to extract both intra-region and inter-region correlations between macro and micro traffic states. While GA2-Naive only considers the movements at each intersection, GA2-Aug also considers the lane-changing behavior of vehicles. The two proposed communication modules are then aggregated into two existing novel RTSC frameworks: RegionLight and Regional-DRL. Experimental results demonstrate that both GA2-Naive and GA2-Aug effectively improve the performance of existing RTSC frameworks under both real and synthetic scenarios. Hyperparameter testing also reveals the robustness and potential of our communication modules in large-scale traffic networks.
LG · Mar 5
U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning
Yiang Wu, Qiong Wu, Pingyi Fan et al.
This demonstration presents U-Parking, a distributed Ultra-Wideband (UWB)-assisted autonomous parking system. By integrating Large Language Models (LLMs)-assisted planning with robust fusion localization and trajectory tracking, it enables reliable automated parking in challenging indoor environments, as validated through real-vehicle demonstrations.
CV · Mar 17, 2025
OptiPMB: Enhancing 3D Multi-Object Tracking with Optimized Poisson Multi-Bernoulli Filtering
Guanhua Ding, Yuxuan Xia, Runwei Guan et al.
Accurate 3D multi-object tracking (MOT) is crucial for autonomous driving, as it enables robust perception, navigation, and planning in complex environments. While deep learning-based solutions have demonstrated impressive 3D MOT performance, model-based approaches remain appealing for their simplicity, interpretability, and data efficiency. Conventional model-based trackers typically rely on random vector-based Bayesian filters within the tracking-by-detection (TBD) framework but face limitations due to heuristic data association and track management schemes. In contrast, random finite set (RFS)-based Bayesian filtering handles object birth, survival, and death in a theoretically sound manner, facilitating interpretability and parameter tuning. In this paper, we present OptiPMB, a novel RFS-based 3D MOT method that employs an optimized Poisson multi-Bernoulli (PMB) filter while incorporating several key innovative designs within the TBD framework. Specifically, we propose a measurement-driven hybrid adaptive birth model for improved track initialization, employ adaptive detection probability parameters to effectively maintain tracks for occluded objects, and optimize density pruning and track extraction modules to further enhance overall tracking performance. Extensive evaluations on nuScenes and KITTI datasets show that OptiPMB achieves superior tracking accuracy compared with state-of-the-art methods, thereby establishing a new benchmark for model-based 3D MOT and offering valuable insights for future research on RFS-based trackers in autonomous driving.
IT · Mar 22, 2018
A Topological Approach to Secure Message Dissemination in Vehicular Networks
Jieqiong Chen, Guoqiang Mao, Changle Li et al.
Secure message dissemination is an important issue in vehicular networks, especially considering the vulnerability of vehicle-to-vehicle message dissemination to malicious attacks. Traditional security mechanisms, largely based on message encryption and key management, can only guarantee secure message exchanges between known source and destination pairs. In vehicular networks, however, every vehicle may learn its surrounding environment and contribute as a source while also acting as a destination or a relay of information from other vehicles, so message exchanges often occur between "stranger" vehicles. For secure message dissemination in vehicular networks against insider attackers, who may tamper with the content of the disseminated messages, ensuring the consistency and integrity of the transmitted messages becomes a major concern that traditional approaches based on message encryption and key management fall short of addressing. In this paper, by incorporating the underlying network topology information, we propose an optimal decision algorithm that is able to maximize the chance of making a correct decision on the message content, assuming prior knowledge of the percentage of malicious vehicles in the network. Furthermore, a novel heuristic decision algorithm is proposed that can make decisions without the aforementioned knowledge of the percentage of malicious vehicles. Simulations are conducted to compare the security performance achieved by our proposed decision algorithms with that achieved by existing ones that do not consider or only partially consider the topological information, to verify the effectiveness of the algorithms. Our results show that by incorporating the network topology information, the security performance can be much improved. This work sheds light on the optimum algorithm design for secure message dissemination.