Haowei Sun

RO
h-index: 10
6 papers
182 citations
Novelty: 53%
AI Score: 44

6 Papers

CV | Apr 23
Instance-level Visual Active Tracking with Occlusion-Aware Planning

Haowei Sun, Kai Zhou, Hao Gao et al.

Visual Active Tracking (VAT) aims to control cameras to follow a target in 3D space, which is critical for applications like drone navigation and security surveillance. However, it faces two key bottlenecks in real-world deployment: confusion from visually similar distractors caused by insufficient instance-level discrimination and severe failure under occlusions due to the absence of active planning. To address these, we propose OA-VAT, a unified pipeline with three complementary modules. First, a training-free Instance-Aware Offline Prototype Initialization aggregates multi-view augmented features via DINOv3 to construct discriminative instance prototypes, mitigating distractor confusion. Second, an Online Prototype Enhancement Tracker enhances prototypes online and integrates a confidence-aware Kalman filter for stable tracking under appearance and motion changes. Third, an Occlusion-Aware Trajectory Planner, trained on our new Planning-20k dataset, uses conditional diffusion to generate obstacle-avoiding paths for occlusion recovery. Experiments demonstrate OA-VAT achieves 0.93 average SR on UnrealCV (+2.2% vs. SOTA TrackVLA), 90.8% average CAR on real-world datasets (+12.1% vs. SOTA GC-VAT), and 81.6% TSR on a DJI Tello drone. Running at 35 FPS on an RTX 3090, it delivers robust, real-time performance for practical deployment.
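The abstract does not say how confidence enters the Kalman filter. As a minimal sketch of one common choice, the example below assumes the measurement noise is inflated when detection confidence is low, so that unreliable detections move the state estimate less. The function name `kalman_step`, the scalar random-walk model, and the parameters `q`, `r0`, and `eps` are all hypothetical and are not taken from OA-VAT.

```python
def kalman_step(x, p, z, conf, q=0.01, r0=0.25, eps=1e-3):
    """One scalar Kalman step with a confidence-scaled measurement noise.

    x, p : prior state estimate and its variance
    z    : new measurement (e.g. one coordinate of the detected target)
    conf : detector confidence in (0, 1]; an assumed input, not from the paper
    """
    # Predict with a random-walk model: state unchanged, uncertainty grows.
    p_pred = p + q
    # Assumed weighting: lower confidence means larger measurement noise.
    r = r0 / max(conf, eps)
    # Standard Kalman gain and update.
    k = p_pred / (p_pred + r)
    x_new = x + k * (z - x)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, k


# Same prior and measurement, different detector confidence.
x_hi, p_hi, k_hi = kalman_step(x=0.0, p=1.0, z=1.0, conf=0.9)
x_lo, p_lo, k_lo = kalman_step(x=0.0, p=1.0, z=1.0, conf=0.1)
print(f"high confidence: gain={k_hi:.3f}, estimate={x_hi:.3f}")
print(f"low confidence:  gain={k_lo:.3f}, estimate={x_lo:.3f}")
```

With a confident detection the estimate moves most of the way toward the measurement. With a weak detection it stays closer to the prediction, which is the behavior one would want during partial occlusion.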

RO | Dec 1, 2024 | Code
Open-World Drone Active Tracking with Goal-Centered Rewards

Haowei Sun, Jinwu Hu, Zhirui Zhang et al.

Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations, providing a more practical solution for effective tracking in dynamic environments. However, accurate Drone Visual Active Tracking using reinforcement learning remains challenging due to the absence of a unified benchmark and the complexity of open-world environments with frequent interference. To address these issues, we pioneer a systematic solution. First, we propose DAT, the first open-world drone active air-to-ground tracking benchmark. It encompasses 24 city-scale scenes, featuring targets with human-like behaviors and high-fidelity dynamics simulation. DAT also provides a digital twin tool for unlimited scene generation. Additionally, we propose a novel reinforcement learning method called GC-VAT, which aims to improve drone tracking performance in complex scenarios. Specifically, we design a Goal-Centered Reward to provide precise feedback across viewpoints to the agent, enabling it to expand perception and movement range through unrestricted perspectives. Inspired by curriculum learning, we introduce a Curriculum-Based Training strategy that progressively enhances the tracking performance in complex environments. Experiments on the simulator and on real-world images demonstrate the superior performance of GC-VAT, achieving a Tracking Success Rate of approximately 72% on the simulator. The benchmark and code are available at https://github.com/SHWplus/DAT_Benchmark.

LG | May 6, 2025
RADE: Learning Risk-Adjustable Driving Environment via Multi-Agent Conditional Diffusion

Jiawei Wang, Xintao Yan, Yao Mu et al.

Generating safety-critical scenarios in high-fidelity simulations offers a promising and cost-effective approach for efficient testing of autonomous vehicles. Existing methods typically rely on manipulating a single vehicle's trajectory through elaborately designed objectives to induce adversarial interactions, often at the cost of realism and scalability. In this work, we propose the Risk-Adjustable Driving Environment (RADE), a simulation framework that generates statistically realistic and risk-adjustable traffic scenes. Built upon a multi-agent diffusion architecture, RADE jointly models the behavior of all agents in the environment and conditions their trajectories on a surrogate risk measure. Unlike traditional adversarial methods, RADE learns risk-conditioned behaviors directly from data, preserving naturalistic multi-agent interactions with controllable risk levels. To ensure physical plausibility, we incorporate a tokenized dynamics check module that efficiently filters generated trajectories using a motion vocabulary. We validate RADE on the real-world rounD dataset, demonstrating that it preserves statistical realism across varying risk levels and naturally increases the likelihood of safety-critical events as the desired risk level increases. Our results highlight RADE's potential as a scalable and realistic tool for AV safety evaluation.

AI | Feb 6, 2021
Corner Case Generation and Analysis for Safety Assessment of Autonomous Vehicles

Haowei Sun, Shuo Feng, Xintao Yan et al.

Testing and evaluation is a crucial step in the development and deployment of Connected and Automated Vehicles (CAVs). To comprehensively evaluate the performance of CAVs, it is necessary to test them in safety-critical scenarios, which rarely happen in naturalistic driving environments. Therefore, how to purposely and systematically generate these corner cases becomes an important problem. Most existing studies focus on generating adversarial examples for the perception systems of CAVs, whereas limited effort has been devoted to the decision-making systems, which are the focus of this paper. As the CAVs need to interact with numerous background vehicles (BVs) for a long duration, the variables that define the corner cases are usually high dimensional, which makes the generation a challenging problem. In this paper, a unified framework is proposed to generate corner cases for the decision-making systems. To address the challenge brought by high dimensionality, the driving environment is formulated as a Markov Decision Process, and deep reinforcement learning techniques are applied to learn the behavior policy of BVs. With the learned policy, BVs will behave and interact with the CAVs more aggressively, resulting in more corner cases. To further analyze the generated corner cases, feature extraction and clustering techniques are used. By selecting representative cases of each cluster and outliers, the valuable corner cases can be identified from all generated corner cases. Simulation results of a highway driving environment show that the proposed methods can effectively generate and identify the valuable corner cases.

SY | Jan 8, 2021
Distributionally Consistent Simulation of Naturalistic Driving Environment for Autonomous Vehicle Testing

Xintao Yan, Shuo Feng, Haowei Sun et al.

Microscopic traffic simulation provides a controllable, repeatable, and efficient testing environment for autonomous vehicles (AVs). To evaluate AVs' safety performance without bias, the probability distributions of environment statistics in the simulated naturalistic driving environment (NDE) need to be consistent with those from the real-world driving environment. However, although human driving behaviors have been extensively investigated in the transportation engineering field, most existing models were developed for traffic flow analysis without considering the distributional consistency of driving behaviors, which could cause significant evaluation bias in AV testing. To fill this research gap, a distributionally consistent NDE modeling framework is proposed in this paper. Using large-scale naturalistic driving data, empirical distributions are obtained to construct the stochastic human driving behavior models under different conditions. To address the error accumulation problem during the simulation, an optimization-based method is further designed to refine the empirical behavior models. Specifically, the vehicle state evolution is modeled as a Markov chain and its stationary distribution is twisted to match the distribution from the real-world driving environment. The framework is evaluated in the case study of a multi-lane highway driving simulation, where the distributional accuracy of the generated NDE is validated and the safety performance of an AV model is effectively evaluated.
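To make the Markov-chain step above concrete, the following sketch computes the stationary distribution of a small transition matrix by power iteration and measures its total-variation gap to a target distribution. It is only an illustration of the quantity such an optimization would drive toward zero, not the paper's refinement method. The 3-state matrix `P`, the `target` vector, and both function names are hypothetical.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi satisfying pi = pi P for a row-stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: nxt[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def total_variation(p, q):
    """Total-variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical 3-state chain (e.g. coarse bins of a vehicle state).
P = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.10, 0.90],
]
# Hypothetical "real-world" distribution over the same three bins.
target = [0.25, 0.50, 0.25]

pi = stationary_distribution(P)
gap = total_variation(pi, target)
print("stationary distribution:", [round(v, 4) for v in pi])
print("total-variation gap to target:", round(gap, 6))
```

In this toy case the chain already matches the target, so the gap is close to zero. A chain built from noisy empirical transition estimates would generally show a nonzero gap, which is the mismatch a refinement step would aim to reduce.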

RO | May 9, 2019
Testing Scenario Library Generation for Connected and Automated Vehicles, Part II: Case Studies

Shuo Feng, Yiheng Feng, Haowei Sun et al.

Testing scenario library generation (TSLG) is a critical step for the development and deployment of connected and automated vehicles (CAVs). In Part I of this study, a general methodology for TSLG is proposed, and theoretical properties are investigated regarding the accuracy and efficiency of CAV evaluation. This paper aims to provide implementation examples and guidelines, and to enhance the proposed methodology under high-dimensional scenarios. Three typical cases, including cut-in, highway-exit, and car-following, are designed and studied in this paper. For each case, the process of library generation and CAV evaluation is elaborated. To address the challenges brought by high dimensions, the proposed methodology is further enhanced with reinforcement learning techniques. For all three cases, results show that the proposed methods can accelerate the CAV evaluation process by multiple orders of magnitude at the same evaluation accuracy, compared with the on-road test method.