Stephen F. Smith

AI
h-index8
15papers
169citations
Novelty42%
AI Score49

15 Papers

ROSep 26, 2023Code
Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter

Hsu-kuang Chiu, Chien-Yi Wang, Min-Hung Chen et al. · microsoft-research, nvidia

Current state-of-the-art autonomous driving vehicles mainly rely on each individual sensor system to perform perception tasks. Such a framework's reliability could be limited by occlusion or sensor failure. To address this issue, more recent research proposes using vehicle-to-vehicle (V2V) communication to share perception information with others. However, most relevant works focus only on cooperative detection and leave cooperative tracking an underexplored research field. A few recent datasets, such as V2V4Real, provide 3D multi-object cooperative tracking benchmarks. However, their proposed methods mainly use cooperative detection results as input to a standard single-sensor Kalman Filter-based tracking algorithm. In their approach, the measurement uncertainty of different sensors from different connected autonomous vehicles (CAVs) may not be properly estimated to utilize the theoretical optimality property of Kalman Filter-based tracking algorithms. In this paper, we propose a novel 3D multi-object cooperative tracking algorithm for autonomous driving via a differentiable multi-sensor Kalman Filter. Our algorithm learns to estimate measurement uncertainty for each detection that can better utilize the theoretical property of Kalman Filter-based tracking methods. The experiment results show that our algorithm improves the tracking accuracy by 17% with only 0.037x communication costs compared with the state-of-the-art method in V2V4Real. Our code and videos are available at https://github.com/eddyhkchiu/DMSTrack/ and https://eddyhkchiu.github.io/dmstrack.github.io/ .

ROFeb 17Code
Lifelong Scalable Multi-Agent Realistic Testbed and A Comprehensive Study on Design Choices in Lifelong AGV Fleet Management Systems

Jingtian Yan, Yulun Zhang, Zhenting Liu et al. · cmu

We present Lifelong Scalable Multi-Agent Realistic Testbed (LSMART), an open-source simulator to evaluate any Multi-Agent Path Finding (MAPF) algorithm in a Fleet Management System (FMS) with Automated Guided Vehicles (AGVs). MAPF aims to move a group of agents from their corresponding starting locations to their goals. Lifelong MAPF (LMAPF) is a variant of MAPF that continuously assigns new goals for agents to reach. LMAPF applications, such as autonomous warehouses, often require a centralized, lifelong system to coordinate the movement of a fleet of robots, typically AGVs. However, existing works on MAPF and LMAPF often assume simplified kinodynamic models, such as pebble motion, as well as perfect execution and communication for AGVs. Prior work has presented SMART, a software capable of evaluating any MAPF algorithms while considering agent kinodynamics, communication delays, and execution uncertainties. However, SMART is designed for MAPF, not LMAPF. Generalizing SMART to an FMS requires many more design choices. First, an FMS parallelizes planning and execution, raising the question of when to plan. Second, given planners with varying optimality and differing agent-model assumptions, one must decide how to plan. Third, when the planner fails to return valid solutions, the system must determine how to recover. In this paper, we first present LSMART, an open-source simulator that incorporates all these considerations to evaluate any MAPF algorithms in an FMS. We then provide experiment results based on state-of-the-art methods for each design choice, offering guidance on how to effectively design centralized lifelong AGV Fleet Management Systems. LSMART is available at https://smart-mapf.github.io/lifelong-smart.

AIOct 16, 2022
Connection-Based Scheduling for Real-Time Intersection Control

Hsu-Chieh Hu, Joseph Zhou, Gregory J. Barlow et al.

We introduce a heuristic scheduling algorithm for real-time adaptive traffic signal control to reduce traffic congestion. This algorithm adopts a lane-based model that estimates the arrival time of all vehicles approaching an intersection through different lanes, and then computes a schedule (i.e., a signal timing plan) that minimizes the cumulative delay incurred by all approaching vehicles. State space, pruning checks and an admissible heuristic for A* search are described and shown to be capable of generating an intersection schedule in real-time (i.e., every second). Due to the effectiveness of the heuristics, the proposed approach outperforms a less expressive Dynamic Programming approach and previous A*-based approaches in run-time performance, both in simulated test environments and actual field tests.

CVFeb 14, 2025Code
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models

Hsu-kuang Chiu, Ryo Hachiuma, Chien-Yi Wang et al.

Current autonomous driving vehicles rely mainly on their individual sensors to understand surrounding scenes and plan for future trajectories, which can be unreliable when the sensors are malfunctioning or occluded. To address this problem, cooperative perception methods via vehicle-to-vehicle (V2V) communication have been proposed, but they have tended to focus on perception tasks like detection or tracking. How those approaches contribute to overall cooperative planning performance is still under-explored. Inspired by recent progress using Large Language Models (LLMs) to build autonomous driving systems, we propose a novel problem setting that integrates a Multi-Modal LLM into cooperative autonomous driving, with the proposed Vehicle-to-Vehicle Question-Answering (V2V-QA) dataset and benchmark. We also propose our baseline method Vehicle-to-Vehicle Multi-Modal Large Language Model (V2V-LLM), which uses an LLM to fuse perception information from multiple connected autonomous vehicles (CAVs) and answer various types of driving-related questions: grounding, notable object identification, and planning. Experimental results show that our proposed V2V-LLM can be a promising unified model architecture for performing various tasks in cooperative autonomous driving, and outperforms other baseline methods that use different fusion approaches. Our work also creates a new research direction that can improve the safety of future autonomous driving systems. The code and data will be released to the public to facilitate open-source research in this field. Our project website: https://eddyhkchiu.github.io/v2vllm.github.io/ .

CVJun 20, 2023
Collision Avoidance Detour for Multi-Agent Trajectory Forecasting

Hsu-kuang Chiu, Stephen F. Smith

We present our approach, Collision Avoidance Detour (CAD), which won the 3rd place award in the 2023 Waymo Open Dataset Challenge - Sim Agents, held at the 2023 CVPR Workshop on Autonomous Driving. To satisfy the motion prediction factorization requirement, we partition all the valid objects into three mutually exclusive sets: Autonomous Driving Vehicle (ADV), World-tracks-to-predict, and World-others. We use different motion models to forecast their future trajectories independently. Furthermore, we also apply collision avoidance detour resampling, additive Gaussian noise, and velocity-based heading estimation to improve the realism of our simulation result.

ROMar 3, 2025Code
Advancing MAPF towards the Real World: A Scalable Multi-Agent Realistic Testbed (SMART)

Jingtian Yan, Zhifei Li, William Kang et al. · cmu

We present Scalable Multi-Agent Realistic Testbed (SMART), a realistic and efficient software tool for evaluating Multi-Agent Path Finding (MAPF) algorithms. MAPF focuses on planning collision-free paths for a group of agents. While state-ofthe-art MAPF algorithms can plan paths for hundreds of robots in seconds, they often rely on simplified robot models, making their real-world performance unclear. Researchers typically lack access to hundreds of physical robots in laboratory settings to evaluate the algorithms. Meanwhile, industrial professionals who lack expertise in MAPF require an easy-to-use simulator to efficiently test and understand the performance of MAPF algorithms in their specific settings. SMART fills this gap with several advantages: (1) SMART uses physics-engine-based simulators to create realistic simulation environments, accounting for complex real-world factors such as robot kinodynamics and execution uncertainties, (2) SMART uses an execution monitor framework based on the Action Dependency Graph, facilitating seamless integration with various MAPF algorithms and robot models, and (3) SMART scales to thousands of robots. The code is publicly available at https://github.com/smart-mapf/smart.

AIDec 10, 2025
Analyzing Planner Design Trade-offs for MAPF under Realistic Simulation

Jingtian Yan, Zhifei Li, William Kang et al.

Multi-Agent Path Finding (MAPF) algorithms are increasingly deployed in industrial warehouses and automated manufacturing facilities, where robots must operate reliably under real-world physical constraints. However, existing MAPF evaluation frameworks typically rely on simplified robot models, leaving a substantial gap between algorithmic benchmarks and practical performance. Recent frameworks such as SMART, incorporate kinodynamic modeling and offer the MAPF community a platform for large-scale, realistic evaluation. Building on this capability, this work investigates how key planner design choices influence performance under realistic execution settings. We systematically study three fundamental factors: (1) the relationship between solution optimality and execution performance, (2) the sensitivity of system performance to inaccuracies in kinodynamic modeling, and (3) the interaction between model accuracy and plan optimality. Empirically, we examine these factors to understand how these design choices affect performance in realistic scenarios. We highlight open challenges and research directions to steer the community toward practical, real-world deployment.

RONov 26, 2025
Bridging Planning and Execution: Multi-Agent Path Finding Under Real-World Deadlines

Jingtian Yan, Shuai Zhou, Stephen F. Smith et al.

The Multi-Agent Path Finding (MAPF) problem aims to find collision-free paths for multiple agents while optimizing objectives such as the sum of costs or makespan. MAPF has wide applications in domains like automated warehouses, manufacturing systems, and airport logistics. However, most MAPF formulations assume a simplified robot model for planning, which overlooks execution-time factors such as kinodynamic constraints, communication latency, and controller variability. This gap between planning and execution is problematic for time-sensitive applications. To bridge this gap, we propose REMAP, an execution-informed MAPF planning framework that can be combined with leading search-based MAPF planners with minor changes. Our framework integrates the proposed ExecTimeNet to accurately estimate execution time based on planned paths. We demonstrate our method for solving MAPF with Real-world Deadlines (MAPF-RD) problem, where agents must reach their goals before a predefined wall-clock time. We integrate our framework with two popular MAPF methods, MAPF-LNS and CBS. Experiments show that REMAP achieves up to 20% improvement in solution quality over baseline methods (e.g., constant execution speed estimators) on benchmark maps with up to 300 agents.

AIAug 2, 2025
WinkTPG: An Execution Framework for Multi-Agent Path Finding Using Temporal Reasoning

Jingtian Yan, Stephen F. Smith, Jiaoyang Li

Planning collision-free paths for a large group of agents is a challenging problem with numerous real-world applications. While recent advances in Multi-Agent Path Finding (MAPF) have shown promising progress, standard MAPF algorithms rely on simplified kinodynamic models, preventing agents from directly following the generated MAPF plan. To bridge this gap, we propose kinodynamic Temporal Plan Graph Planning (kTPG), a multi-agent speed optimization algorithm that efficiently refines a MAPF plan into a kinodynamically feasible plan while accounting for uncertainties and preserving collision-freeness. Building on kTPG, we propose Windowed kTPG (WinkTPG), a MAPF execution framework that incrementally refines MAPF plans using a window-based mechanism, dynamically incorporating agent information during execution to reduce uncertainty. Experiments show that WinkTPG can generate speed profiles for up to 1,000 agents in 1 second and improves solution quality by up to 51.7% over existing MAPF execution methods.

ROMay 26, 2023
Selective Communication for Cooperative Perception in End-to-End Autonomous Driving

Hsu-kuang Chiu, Stephen F. Smith

The reliability of current autonomous driving systems is often jeopardized in situations when the vehicle's field-of-view is limited by nearby occluding objects. To mitigate this problem, vehicle-to-vehicle communication to share sensor information among multiple autonomous driving vehicles has been proposed. However, to enable timely processing and use of shared sensor data, it is necessary to constrain communication bandwidth, and prior work has done so by restricting the number of other cooperative vehicles and randomly selecting the subset of vehicles to exchange information with from all those that are within communication range. Although simple and cost effective from a communication perspective, this selection approach suffers from its susceptibility to missing those vehicles that possess the perception information most critical to navigation planning. Inspired by recent multi-agent path finding research, we propose a novel selective communication algorithm for cooperative perception to address this shortcoming. Implemented with a lightweight perception network and a previously developed control network, our algorithm is shown to produce higher success rates than a random selection approach on previously studied safety-critical driving scenario simulations, with minimal additional communication overhead.

ROJul 3, 2019
Cooperative Schedule-Driven Intersection Control with Connected and Autonomous Vehicles

Hsu-Chieh Hu, Stephen F. Smith, Rick Goldstein

Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In this paper, we propose a cooperative algorithm that utilizes both connected and autonomous vehicles (CAV) and schedule-driven traffic control to create better traffic flow in the city. The algorithm enables an intersection scheduling agent to adjust the arrival time of an approaching platoon through use of wireless communication to control the velocity of vehicles. The sequence of approaching platoons is thus shifted toward a new shape that has smaller cumulative delay. We demonstrate how this algorithm outperforms the original approach in a real-time traffic signal control problem.

AIJul 3, 2019
Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control

Hsu-Chieh Hu, Stephen F. Smith

Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In order to achieve network level coordination in a scalable manner, scheduling agents communicate only with their direct neighbors. Each time an agent generates a new intersection schedule it communicates its expected outflows to its downstream neighbors as a prediction of future demand and these outflows are appended to the downstream agent's locally perceived demand. In this paper, we extend this basic coordination algorithm to additionally incorporate the complementary flow of information reflective of an intersection's current congestion level to its upstream neighbors. We present an asynchronous decentralized algorithm for updating intersection schedules and congestion level estimates based on these bi-directional information flows. By relating this algorithm to the self-optimized decision making of the basic operation, we are able to approach network-wide optimality and reduce inefficiency due to strictly self-interested intersection control decisions.

AIMar 6, 2019
Coping with Large Traffic Volumes in Schedule-Driven Traffic Signal Control

Hsu-Chieh Hu, Stephen F. Smith

Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to significantly improve traffic flow efficiency in complex urban road networks. However, in situations where vehicle volumes increase to the point that the physical capacity of a road network reaches or exceeds saturation, it has been observed that the effectiveness of a schedule-driven approach begins to degrade, leading to progressively higher network congestion. In essence, the traffic control problem becomes less of a scheduling problem and more of a queue management problem in this circumstance. In this paper we propose a composite approach to real-time traffic control that uses sensed information on queue lengths to influence scheduling decisions and gracefully shift the signal control strategy to queue management in high volume/high congestion settings. Specifically, queue-length information is used to establish weights for the sensed vehicle clusters that must be scheduled through a given intersection at any point, and hence bias the wait time minimization calculation. To compute these weights, we develop a model in which successive movement phases are viewed as different states of an Ising model, and parameters quantify strength of interactions. To ensure scalability, queue information is only exchanged between direct neighbors and the asynchronous nature of local intersection scheduling is preserved. We demonstrate the potential of the approach through microscopic traffic simulation of a real-world road network, showing a 60% reduction in average wait times over the baseline schedule-driven approach in heavy traffic scenarios. We also report initial field test results, which show the ability to reduce queues during heavy traffic periods.

MAJan 1, 2014
Design of a GIS-based Assistant Software Agent for the Incident Commander to Coordinate Emergency Response Operations

Reza Nourjou, Michinori Hatayama, Stephen F. Smith et al.

Problem: This paper addresses the design of an intelligent software system for the IC (incident commander) of a team in order to coordinate actions of agents (field units or robots) in the domain of emergency/crisis response operations. Objective: This paper proposes GICoordinator. It is a GIS-based assistant software agent that assists and collaborates with the human planner in strategic planning and macro tasks assignment for centralized multi-agent coordination. Method: Our approach to design GICoordinator was to: analyze the problem, design a complete data model, design an architecture of GICoordinator, specify required capabilities of human and system in coordination problem solving, specify development tools, and deploy. Result: The result was an architecture/design of GICoordinator that contains system requirements. Findings: GICoordinator efficiently integrates geoinformatics with artifice intelligent techniques in order to provide a spatial intelligent coordinator system for an IC to efficiently coordinate and control agents by making macro/strategic decisions. Results define a framework for future works to develop this system.

AIJun 13, 2012
New Techniques for Algorithm Portfolio Design

Matthew Streeter, Stephen F. Smith

We present and evaluate new techniques for designing algorithm portfolios. In our view, the problem has both a scheduling aspect and a machine learning aspect. Prior work has largely addressed one of the two aspects in isolation. Building on recent work on the scheduling aspect of the problem, we present a technique that addresses both aspects simultaneously and has attractive theoretical guarantees. Experimentally, we show that this technique can be used to improve the performance of state-of-the-art algorithms for Boolean satisfiability, zero-one integer programming, and A.I. planning.