Liting Sun

RO
h-index30
37papers
1,956citations
Novelty49%
AI Score40

37 Papers

LGJul 7, 2022Code
CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships

Rebecca Roelofs, Liting Sun, Ben Caine et al.

As machine learning models become increasingly prevalent in motion forecasting for autonomous vehicles (AVs), it is critical to ensure that model predictions are safe and reliable. However, exhaustively collecting and labeling the data necessary to fully test the long tail of rare and challenging scenarios is difficult and expensive. In this work, we construct a new benchmark for evaluating and improving model robustness by applying perturbations to existing data. Specifically, we conduct an extensive labeling effort to identify causal agents, or agents whose presence influences human drivers' behavior in any format, in the Waymo Open Motion Dataset (WOMD), and we use these labels to perturb the data by deleting non-causal agents from the scene. We evaluate a diverse set of state-of-the-art deep-learning model architectures on our proposed benchmark and find that all models exhibit large shifts under even non-causal perturbation: we observe a 25-38% relative change in minADE as compared to the original. We also investigate techniques to improve model robustness, including increasing the training dataset size and using targeted data augmentations that randomly drop non-causal agents throughout training. Finally, we release the causal agent labels (at https://github.com/google-research/causal-agents) as an additional attribute to WOMD and the robustness benchmarks to aid the community in building more reliable and safe deep-learning models for motion forecasting.

CVOct 30, 2025
WOD-E2E: Waymo Open Dataset for End-to-End Driving in Challenging Long-tail Scenarios

Runsheng Xu, Hubert Lin, Wonseok Jeon et al.

Vision-based end-to-end (E2E) driving has garnered significant interest in the research community due to its scalability and synergy with multimodal large language models (MLLMs). However, current E2E driving benchmarks primarily feature nominal scenarios, failing to adequately test the true potential of these systems. Furthermore, existing open-loop evaluation metrics often fall short in capturing the multi-modal nature of driving or effectively evaluating performance in long-tail scenarios. To address these gaps, we introduce the Waymo Open Dataset for End-to-End Driving (WOD-E2E). WOD-E2E contains 4,021 driving segments (approximately 12 hours), specifically curated for challenging long-tail scenarios that that are rare in daily life with an occurring frequency of less than 0.03%. Concretely, each segment in WOD-E2E includes the high-level routing information, ego states, and 360-degree camera views from 8 surrounding cameras. To evaluate the E2E driving performance on these long-tail situations, we propose a novel open-loop evaluation metric: Rater Feedback Score (RFS). Unlike conventional metrics that measure the distance between predicted way points and the logs, RFS measures how closely the predicted trajectory matches rater-annotated trajectory preference labels. We have released rater preference labels for all WOD-E2E validation set segments, while the held out test set labels have been used for the 2025 WOD-E2E Challenge. Through our work, we aim to foster state of the art research into generalizable, robust, and safe end-to-end autonomous driving agents capable of handling complex real-world situations.

ROFeb 10, 2022
Transferable and Adaptable Driving Behavior Prediction

Letian Wang, Yeping Hu, Liting Sun et al.

While autonomous vehicles still struggle to solve challenging situations during on-road driving, humans have long mastered the essence of driving with efficient, transferable, and adaptable driving capability. By mimicking humans' cognition model and semantic understanding during driving, we propose HATN, a hierarchical framework to generate high-quality, transferable, and adaptable predictions for driving behaviors in multi-agent dense-traffic environments. Our hierarchical method consists of a high-level intention identification policy and a low-level trajectory generation policy. We introduce a novel semantic sub-task definition and generic state representation for each sub-task. With these techniques, the hierarchical framework is transferable across different driving scenarios. Besides, our model is able to capture variations of driving behaviors among individuals and scenarios by an online adaptation module. We demonstrate our algorithms in the task of trajectory prediction for real traffic data at intersections and roundabouts from the INTERACTION dataset. Through extensive numerical studies, it is evident that our method significantly outperformed other methods in terms of prediction accuracy, transferability, and adaptability. Pushing the state-of-the-art performance by a considerable margin, we also provide a cognitive view of understanding the driving behavior behind such improvement. We highlight that in the future, more research attention and effort are deserved for transferability and adaptability. It is not only due to the promising performance elevation of prediction and planning algorithms, but more fundamentally, they are crucial for the scalable and general deployment of autonomous vehicles.

RONov 1, 2021
Hierarchical Adaptable and Transferable Networks (HATN) for Driving Behavior Prediction

Letian Wang, Yeping Hu, Liting Sun et al.

When autonomous vehicles still struggle to solve challenging situations during on-road driving, humans have long mastered the essence of driving with efficient transferable and adaptable driving capability. By mimicking humans' cognition model and semantic understanding during driving, we present HATN, a hierarchical framework to generate high-quality driving behaviors in multi-agent dense-traffic environments. Our method hierarchically consists of a high-level intention identification and low-level action generation policy. With the semantic sub-task definition and generic state representation, the hierarchical framework is transferable across different driving scenarios. Besides, our model is also able to capture variations of driving behaviors among individuals and scenarios by an online adaptation module. We demonstrate our algorithms in the task of trajectory prediction for real traffic data at intersections and roundabouts, where we conducted extensive studies of the proposed method and demonstrated how our method outperformed other methods in terms of prediction accuracy and transferability.

ROSep 29, 2021
Safety Assurances for Human-Robot Interaction via Confidence-aware Game-theoretic Human Models

Ran Tian, Liting Sun, Andrea Bajcsy et al.

An outstanding challenge with safety methods for human-robot interaction is reducing their conservatism while maintaining robustness to variations in human behavior. In this work, we propose that robots use confidence-aware game-theoretic models of human behavior when assessing the safety of a human-robot interaction. By treating the influence between the human and robot as well as the human's rationality as unobserved latent states, we succinctly infer the degree to which a human is following the game-theoretic interaction model. We leverage this model to restrict the set of feasible human controls during safety verification, enabling the robot to confidently modulate the conservatism of its safety monitor online. Evaluations in simulated human-robot scenarios and ablation studies demonstrate that imbuing safety monitors with confidence-aware game-theoretic models enables both safe and efficient human-robot interaction. Moreover, evaluations with real traffic data show that our safety monitor is less conservative than traditional safety methods in real human driving scenarios.

ROSep 26, 2021
Anytime Game-Theoretic Planning with Active Reasoning About Humans' Latent States for Human-Centered Robots

Ran Tian, Liting Sun, Masayoshi Tomizuka et al.

A human-centered robot needs to reason about the cognitive limitation and potential irrationality of its human partner to achieve seamless interactions. This paper proposes an anytime game-theoretic planner that integrates iterative reasoning models, a partially observable Markov decision process, and chance-constrained Monte-Carlo belief tree search for robot behavioral planning. Our planner enables a robot to safely and actively reason about its human partner's latent cognitive states (bounded intelligence and irrationality) in real-time to maximize its utility better. We validate our approach in an autonomous driving domain where our behavioral planner and a low-level motion controller hierarchically control an autonomous car to negotiate traffic merges. Simulations and user studies are conducted to show our planner's effectiveness.

ROSep 3, 2021
Iterative Imitation Policy Improvement for Interactive Autonomous Driving

Zhao-Heng Yin, Chenran Li, Liting Sun et al.

We propose an imitation learning system for autonomous driving in urban traffic with interactions. We train a Behavioral Cloning~(BC) policy to imitate driving behavior collected from the real urban traffic, and apply the data aggregation algorithm to improve its performance iteratively. Applying data aggregation in this setting comes with two challenges. The first challenge is that it is expensive and dangerous to collect online rollout data in the real urban traffic. Creating similar traffic scenarios in simulator like CARLA for online rollout collection can also be difficult. Instead, we propose to create a weak simulator from the training dataset, in which all the surrounding vehicles follow the data trajectory provided by the dataset. We find that the collected online data in such a simulator can still be used to improve BC policy's performance. The second challenge is the tedious and time-consuming process of human labelling process during online rollout. To solve this problem, we use an A$^*$ planner as a pseudo-expert to provide expert-like demonstration. We validate our proposed imitation learning system in the real urban traffic scenarios. The experimental results show that our system can significantly improve the performance of baseline BC policy.

LGAug 14, 2021
A Microscopic Pandemic Simulator for Pandemic Prediction Using Scalable Million-Agent Reinforcement Learning

Zhenggang Tang, Kai Yan, Liting Sun et al.

Microscopic epidemic models are powerful tools for government policy makers to predict and simulate epidemic outbreaks, which can capture the impact of individual behaviors on the macroscopic phenomenon. However, existing models only consider simple rule-based individual behaviors, limiting their applicability. This paper proposes a deep-reinforcement-learning-powered microscopic model named Microscopic Pandemic Simulator (MPS). By replacing rule-based agents with rational agents whose behaviors are driven to maximize rewards, the MPS provides a better approximation of real world dynamics. To efficiently simulate with massive amounts of agents in MPS, we propose Scalable Million-Agent DQN (SMADQN). The MPS allows us to efficiently evaluate the impact of different government strategies. This paper first calibrates the MPS against real-world data in Allegheny, US, then demonstratively evaluates two government strategies: information disclosure and quarantine. The results validate the effectiveness of the proposed method. As a broad impact, this paper provides novel insights for the application of DRL in large scale agent-based networks such as economic and social networks.

ROAug 14, 2021
Constrained Iterative LQG for Real-Time Chance-Constrained Gaussian Belief Space Planning

Jianyu Chen, Yutaka Shimizu, Liting Sun et al.

Motion planning under uncertainty is of significant importance for safety-critical systems such as autonomous vehicles. Such systems have to satisfy necessary constraints (e.g., collision avoidance) with potential uncertainties coming from either disturbed system dynamics or noisy sensor measurements. However, existing motion planning methods cannot efficiently find the robust optimal solutions under general nonlinear and non-convex settings. In this paper, we formulate such problem as chance-constrained Gaussian belief space planning and propose the constrained iterative Linear Quadratic Gaussian (CILQG) algorithm as a real-time solution. In this algorithm, we iteratively calculate a Gaussian approximation of the belief and transform the chance-constraints. We evaluate the effectiveness of our method in simulations of autonomous driving planning tasks with static and dynamic obstacles. Results show that CILQG can handle uncertainties more appropriately and has faster computation time than baseline methods.

ROJun 4, 2021
Negotiation-Aware Reachability-Based Safety Verification for AutonomousDriving in Interactive Scenarios

Ran Tian, Anjian Li, Masayoshi Tomizuka et al.

Safety assurance is a critical yet challenging aspect when developing self-driving technologies. Hamilton-Jacobi backward-reachability analysis is a formal verification tool for verifying the safety of dynamic systems in the presence of disturbances. However, the standard approach is too conservative to be applied to self-driving applications due to its worst-case assumption on humans' behaviors (i.e., guard against worst-case outcomes). In this work, we integrate a learning-based prediction algorithm and a game-theoretic human behavioral model to online update the conservativeness of backward-reachability analysis. We evaluate our approach using real driving data. The results show that, with reasonable assumptions on human behaviors, our approach can effectively reduce the conservativeness of the standard approach without sacrificing its safety verification ability.

AIMar 9, 2021
On complementing end-to-end human behavior predictors with planning

Liting Sun, Xiaogang Jia, Anca D. Dragan

High capacity end-to-end approaches for human motion (behavior) prediction have the ability to represent subtle nuances in human behavior, but struggle with robustness to out of distribution inputs and tail events. Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions: it is much more stable in the face of distribution shift (as we verify in this work), but it has high inductive bias, missing important aspects that drive human decisions, and ignoring cognitive biases that make human behavior suboptimal. In this work, we analyze one family of approaches that strive to get the best of both worlds: use the end-to-end predictor on common cases, but do not rely on it for tail events / out-of-distribution inputs -- switch to the planning-based predictor there. We contribute an analysis of different approaches for detecting when to make this switch, using an autonomous driving domain. We find that promising approaches based on ensembling or generative modeling of the training distribution might not be reliable, but that there very simple methods which can perform surprisingly well -- including training a classifier to pick up on tell-tale issues in predicted trajectories.

AIMar 7, 2021
Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data

Ran Tian, Masayoshi Tomizuka, Liting Sun

Reward function, as an incentive representation that recognizes humans' agency and rationalizes humans' actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning is an effective way to retrieve reward functions from demonstrations. However, it has always been challenging when applying it to multi-agent settings since the mutual influence between agents has to be appropriately modeled. To tackle this challenge, previous work either exploits equilibrium solution concepts by assuming humans as perfectly rational optimizers with unbounded intelligence or pre-assigns humans' interaction strategies a priori. In this work, we advocate that humans are bounded rational and have different intelligence levels when reasoning about others' decision-making process, and such an inherent and latent characteristic should be accounted for in reward learning algorithms. Hence, we exploit such insights from Theory-of-Mind and propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning. We validate our approach in both zero-sum and general-sum games with synthetic agents and illustrate a practical application to learning human drivers' reward functions from real driving data. We compare our approach with two baseline algorithms. The results show that by reasoning about humans' latent intelligence levels, the proposed approach has more flexibility and capability to retrieve reward functions that explain humans' driving behaviors better.

ROMar 1, 2021
Diverse Critical Interaction Generation for Planning and Planner Evaluation

Zhao-Heng Yin, Lingfeng Sun, Liting Sun et al.

Generating diverse and comprehensive interacting agents to evaluate the decision-making modules is essential for the safe and robust planning of autonomous vehicles~(AV). Due to efficiency and safety concerns, most researchers choose to train interactive adversary~(competitive or weakly competitive) agents in simulators and generate test cases to interact with evaluated AVs. However, most existing methods fail to provide both natural and critical interaction behaviors in various traffic scenarios. To tackle this problem, we propose a styled generative model RouteGAN that generates diverse interactions by controlling the vehicles separately with desired styles. By altering its style coefficients, the model can generate trajectories with different safety levels serve as an online planner. Experiments show that our model can generate diverse interactions in various scenarios. We evaluate different planners with our model by testing their collision rate in interaction with RouteGAN planners of multiple critical levels.

ROFeb 13, 2021
Learning Variable Impedance Control via Inverse Reinforcement Learning for Force-Related Tasks

Xiang Zhang, Liting Sun, Zhian Kuang et al.

Many manipulation tasks require robots to interact with unknown environments. In such applications, the ability to adapt the impedance according to different task phases and environment constraints is crucial for safety and performance. Although many approaches based on deep reinforcement learning (RL) and learning from demonstration (LfD) have been proposed to obtain variable impedance skills on contact-rich manipulation tasks, these skills are typically task-specific and could be sensitive to changes in task settings. This paper proposes an inverse reinforcement learning (IRL) based approach to recover both the variable impedance policy and reward function from expert demonstrations. We explore different action space of the reward functions to achieve a more general representation of expert variable impedance skills. Experiments on two variable impedance tasks (Peg-in-Hole and Cup-on-Plate) were conducted in both simulations and on a real FANUC LR Mate 200iD/7L industrial robot. The comparison results with behavior cloning and force-based IRL proved that the learned reward function in the gain action space has better transferability than in the force space. Experiment videos are available at https://msc.berkeley.edu/research/impedance-irl.html.

SYFeb 6, 2021
Practical Fractional-Order Variable-Gain Super-Twisting Control with Application to Wafer Stages of Photolithography Systems

Zhian Kuang, Liting Sun, Huijun Gao et al.

In this paper, a practical fractional-order variable-gain super-twisting algorithm (PFVSTA) is proposed to improve the tracking performance of wafer stages for semiconductor manufacturing. Based on the sliding mode control (SMC), the proposed PFVSTA enhances the tracking performance from three aspects: 1) alleviating the chattering phenomenon via super-twisting algorithm and a novel fractional-order sliding surface~(FSS) design, 2) improving the dynamics of states on the sliding surface with fast response and small overshoots via the designed novel FSS and 3) compensating for disturbances via variable-gain control law. Based on practical conditions, this paper analyzes the stability of the controller and illustrates the theoretical principle to compensate for the uncertainties caused by accelerations. Moreover, numerical simulations prove the effectiveness of the proposed sliding surface and control scheme, and they are in agreement with the theoretical analysis. Finally, practice-based comparative experiments are conducted. The results show that the proposed PFVSTA can achieve much better tracking performance than the conventional methods from various perspectives.

ROFeb 6, 2021
Feedback-based Digital Higher-order Terminal Sliding Mode for 6-DOF Industrial Manipulators

Zhian Kuang, Xiang Zhang, Liting Sun et al.

The precise motion control of a multi-degree of freedom~(DOF) robot manipulator is always challenging due to its nonlinear dynamics, disturbances, and uncertainties. Because most manipulators are controlled by digital signals, a novel higher-order sliding mode controller in the discrete-time form with time delay estimation is proposed in this paper. The dynamic model of the manipulator used in the design allows proper handling of nonlinearities, uncertainties and disturbances involved in the problem. Specifically, parametric uncertainties and disturbances are handled by the time delay estimation and the nonlinearity of the manipulator is addressed by the feedback structure of the controller. The combination of terminal sliding mode surface and higher-order control scheme in the controller guarantees a fast response with a small chattering amplitude. Moreover, the controller is designed with a modified sliding mode surface and variable-gain structure, so that the performance of the controller is further enhanced. We also analyze the condition to guarantee the stability of the closed-loop system in this paper. Finally, the simulation and experimental results prove that the proposed control scheme has a precise performance in a robot manipulator system.

SYJan 31, 2021
Precise Motion Control of Wafer Stages via Adaptive Neural Network and Fractional-Order Super-Twisting Algorithm

Zhian Kuang, Liting Sun, Huijun Gao et al.

To obtain precise motion control of wafer stages, an adaptive neural network and fractional-order super-twisting control strategy is proposed. Based on sliding mode control (SMC), the proposed controller aims to address two challenges in SMC: 1) reducing the chattering phenomenon, and 2) attenuating the influence of model uncertainties and disturbances. For the first challenge, a fractional-order terminal sliding mode surface and a super-twisting algorithm are integrated into the SMC design. To attenuate uncertainties and disturbances, an add-on control structure based on the radial basis function (RBF) neural network is introduced. Stability analysis of the closed-loop control system is provided. Finally, experiments on a wafer stage testbed system are conducted, which proves that the proposed controller can robustly improve the tracking performance in the presence of uncertainties and disturbances compared to conventional and previous controllers.

ROJan 17, 2021
A Safe Hierarchical Planning Framework for Complex Driving Scenarios based on Reinforcement Learning

Jinning Li, Liting Sun, Jianyu Chen et al.

Autonomous vehicles need to handle various traffic conditions and make safe and efficient decisions and maneuvers. However, on the one hand, a single optimization/sampling-based motion planner cannot efficiently generate safe trajectories in real time, particularly when there are many interactive vehicles near by. On the other hand, end-to-end learning methods cannot assure the safety of the outcomes. To address this challenge, we propose a hierarchical behavior planning framework with a set of low-level safe controllers and a high-level reinforcement learning algorithm (H-CtRL) as a coordinator for the low-level controllers. Safety is guaranteed by the low-level optimization/sampling-based controllers, while the high-level reinforcement learning algorithm makes H-CtRL an adaptive and efficient behavior planner. To train and test our proposed algorithm, we built a simulator that can reproduce traffic scenes using real-world datasets. The proposed H-CtRL is proved to be effective in various realistic simulation scenarios, with satisfying performance in terms of both safety and efficiency.

ROJan 15, 2021
Interaction-Aware Behavior Planning for Autonomous Vehicles Validated with Real Traffic Data

Jinning Li, Liting Sun, Wei Zhan et al.

Autonomous vehicles (AVs) need to interact with other traffic participants who can be either cooperative or aggressive, attentive or inattentive. Such different characteristics can lead to quite different interactive behaviors. Hence, to achieve safe and efficient autonomous driving, AVs need to be aware of such uncertainties when they plan their own behaviors. In this paper, we formulate such a behavior planning problem as a partially observable Markov Decision Process (POMDP) where the cooperativeness of other traffic participants is treated as an unobservable state. Under different cooperativeness levels, we learn the human behavior models from real traffic data via the principle of maximum likelihood. Based on that, the POMDP problem is solved by Monte-Carlo Tree Search. We verify the proposed algorithm in both simulations and real traffic data on a lane change scenario, and the results show that the proposed algorithm can successfully finish the lane changes without collisions.

RONov 24, 2020
Prediction-Based Reachability for Collision Avoidance in Autonomous Driving

Anjian Li, Liting Sun, Wei Zhan et al.

Safety is an important topic in autonomous driving since any collision may cause serious injury to people and damage to property. Hamilton-Jacobi (HJ) Reachability is a formal method that verifies safety in multi-agent interaction and provides a safety controller for collision avoidance. However, due to the worst-case assumption on the cars future behaviours, reachability might result in too much conservatism such that the normal operation of the vehicle is badly hindered. In this paper, we leverage the power of trajectory prediction and propose a prediction-based reachability framework to compute safety controllers. Instead of always assuming the worst case, we cluster the car's behaviors into multiple driving modes, e.g. left turn or right turn. Under each mode, a reachability-based safety controller is designed based on a less conservative action set. For online implementation, we first utilize the trajectory prediction and our proposed mode classifier to predict the possible modes, and then deploy the corresponding safety controller. Through simulations in a T-intersection and an 8-way roundabout, we demonstrate that our prediction-based reachability method largely avoids collision between two interacting cars and reduces the conservatism that the safety controller brings to the car's original operation.

AINov 4, 2020
IDE-Net: Interactive Driving Event and Pattern Extraction from Human Data

Xiaosong Jia, Liting Sun, Masayoshi Tomizuka et al.

Autonomous vehicles (AVs) need to share the road with multiple, heterogeneous road users in a variety of driving scenarios. It is overwhelming and unnecessary to carefully interact with all observed agents, and AVs need to determine whether and when to interact with each surrounding agent. In order to facilitate the design and testing of prediction and planning modules of AVs, in-depth understanding of interactive behavior is expected with proper representation, and events in behavior data need to be extracted and categorized automatically. Answers to what are the essential patterns of interactions are also crucial for these motivations in addition to answering whether and when. Thus, learning to extract interactive driving events and patterns from human data for tackling the whether-when-what tasks is of critical importance for AVs. There is, however, no clear definition and taxonomy of interactive behavior, and most of the existing works are based on either manual labelling or hand-crafted rules and features. In this paper, we propose the Interactive Driving event and pattern Extraction Network (IDE-Net), which is a deep learning framework to automatically extract interaction events and patterns directly from vehicle trajectories. In IDE-Net, we leverage the power of multi-task learning and proposed three auxiliary tasks to assist the pattern extraction in an unsupervised fashion. We also design a unique spatial-temporal block to encode the trajectory data. Experimental results on the INTERACTION dataset verified the effectiveness of such designs in terms of better generalizability and effective pattern extraction. We find three interpretable patterns of interactions, bringing insights for driver behavior representation, modeling and comprehension. Both objective and subjective evaluation metrics are adopted in our analysis of the learned patterns.

ROOct 28, 2020
Socially-Compatible Behavior Design of Autonomous Vehicles with Verification on Real Human Data

Letian Wang, Liting Sun, Masayoshi Tomizuka et al.

As more and more autonomous vehicles (AVs) are being deployed on public roads, designing socially compatible behaviors for them is becoming increasingly important. In order to generate safe and efficient actions, AVs need to not only predict the future behaviors of other traffic participants, but also be aware of the uncertainties associated with such behavior prediction. In this paper, we propose an uncertain-aware integrated prediction and planning (UAPP) framework. It allows the AVs to infer the characteristics of other road users online and generate behaviors optimizing not only their own rewards, but also their courtesy to others, and their confidence regarding the prediction uncertainties. We first propose the definitions for courtesy and confidence. Based on that, their influences on the behaviors of AVs in interactive driving scenarios are explored. Moreover, we evaluate the proposed algorithm on naturalistic human driving data by comparing the generated behavior against ground truth. Results show that the online inference can significantly improve the human-likeness of the generated behaviors. Furthermore, we find that human drivers show great courtesy to others, even for those without right-of-way. We also find that such driving preferences vary significantly in different cultures.

LGSep 3, 2020
Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory

Ran Tian, Liting Sun, Masayoshi Tomizuka

Classical game-theoretic approaches for multi-agent systems in both the forward policy design problem and the inverse reward learning problem often make strong rationality assumptions: agents perfectly maximize expected utilities under uncertainties. Such assumptions, however, substantially mismatch with observed humans' behaviors such as satisficing with sub-optimal, risk-seeking, and loss-aversion decisions. In this paper, we investigate the problem of bounded risk-sensitive Markov Game (BRSMG) and its inverse reward learning problem for modeling human realistic behaviors and learning human behavioral models. Drawing on iterative reasoning models and cumulative prospect theory, we embrace that humans have bounded intelligence and maximize risk-sensitive utilities in BRSMGs. Convergence analysis for both the forward policy design and the inverse reward learning problems are established under the BRSMG framework. We validate the proposed forward policy design and inverse reward learning algorithms in a navigation scenario. The results show that the behaviors of agents demonstrate both risk-averse and risk-seeking characteristics. Moreover, in the inverse reward learning task, the proposed bounded risk-sensitive inverse learning algorithm outperforms a baseline risk-neutral inverse learning algorithm by effectively recovering not only more accurate reward values but also the intelligence levels and the risk-measure parameters given demonstrations of agents' interactive behaviors.

ROAug 20, 2020
Expressing Diverse Human Driving Behavior with Probabilistic Rewards and Online Inference

Liting Sun, Zheng Wu, Hengbo Ma et al.

In human-robot interaction (HRI) systems, such as autonomous vehicles, understanding and representing human behavior are important. Human behavior is naturally rich and diverse. Cost/reward learning, as an efficient way to learn and represent human behavior, has been successfully applied in many domains. Most of traditional inverse reinforcement learning (IRL) algorithms, however, cannot adequately capture the diversity of human behavior since they assume that all behavior in a given dataset is generated by a single cost function.In this paper, we propose a probabilistic IRL framework that directly learns a distribution of cost functions in continuous domain. Evaluations on both synthetic data and real human driving data are conducted. Both the quantitative and subjective results show that our proposed framework can better express diverse human driving behaviors, as well as extracting different driving styles that match what human participants interpret in our user study.

ROJun 22, 2020
Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning with Application to Autonomous Driving

Zheng Wu, Liting Sun, Wei Zhan et al.

In the past decades, we have witnessed significant progress in the domain of autonomous driving. Advanced techniques based on optimization and reinforcement learning (RL) become increasingly powerful at solving the forward problem: given designed reward/cost functions, how should we optimize them and obtain driving policies that interact with the environment safely and efficiently. Such progress has raised another equally important question: \emph{what should we optimize}? Instead of manually specifying the reward functions, it is desired that we can extract what human drivers try to optimize from real traffic data and assign that to autonomous vehicles to enable more naturalistic and transparent interaction between humans and intelligent agents. To address this issue, we present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm in this paper. Different from existing IRL algorithms, by introducing an efficient continuous-domain trajectory sampler, the proposed algorithm can directly learn the reward functions in the continuous domain while considering the uncertainties in demonstrated trajectories from human drivers. We evaluate the proposed algorithm on real driving data, including both non-interactive and interactive scenarios. The experimental results show that the proposed algorithm achieves more accurate prediction performance with faster convergence speed and better generalization compared to other baseline IRL algorithms.

ROJan 27, 2020
Experimental Evaluation of Human Motion Prediction: Toward Safe and Efficient Human Robot Collaboration

Weiye Zhao, Liting Sun, Changliu Liu et al.

Human motion prediction is non-trivial in modern industrial settings. Accurate prediction of human motion can not only improve efficiency in human robot collaboration, but also enhance human safety in close proximity to robots. Among existing prediction models, the parameterization and identification methods of those models vary. It remains unclear what is the necessary parameterization of a prediction model, whether online adaptation of the model is necessary, and whether prediction can help improve safety and efficiency during human robot collaboration. These problems result from the difficulty to quantitatively evaluate various prediction models in a closed-loop fashion in real human-robot interaction settings. This paper develops a method to evaluate the closed-loop performance of different prediction models. In particular, we compare models with different parameterizations and models with or without online parameter adaptation. Extensive experiments were conducted on a human robot collaboration platform. The experimental results demonstrated that human motion prediction significantly enhanced the collaboration efficiency and human safety. Adaptable prediction models that were parameterized by neural networks achieved the best performance.

ROOct 22, 2019
Multiple criteria decision-making for lane-change model

Ao Li, Liting Sun, Wei Zhan et al.

Simulation has long been an essential part of testing autonomous driving systems, but only recently has simulation been useful for building and training self-driving vehicles. Vehicle behavioural models are necessary to simulate the interactions between robot cars. This paper proposed a new method to formalize the lane-changing model in urban driving scenarios. We define human incentives from different perspectives, speed incentive, route change incentive, comfort incentive and courtesy incentive etc. We applied a decision-theoretical tool, called Multi-Criteria Decision Making (MCDM) to take these incentive policies into account. The strategy of combination is according to different driving style which varies for each driving. Thus a lane-changing decision selection algorithm is proposed. Not only our method allows for varying the motivation of lane-changing from the purely egoistic desire to a more courtesy concern, but also they can mimic drivers' state, inattentive or concentrate, which influences their driving Behaviour. We define some cost functions and calibrate the parameters with different scenarios of traffic data. Distinguishing driving styles are used to aggregate decision-makers' assessments about various criteria weightings to obtain the action drivers desire most. Our result demonstrates the proposed method can produce varied lane-changing behaviour. Unlike other lane-changing models based on artificial intelligence methods, our model has more flexible controllability.

ROSep 30, 2019
INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps

Wei Zhan, Liting Sun, Di Wang et al.

Behavior-related research areas such as motion prediction/planning, representation/imitation learning, behavior modeling/generation, and algorithm testing, require support from high-quality motion datasets containing interactive driving scenarios with different driving cultures. In this paper, we present an INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps. Five features of the dataset are highlighted. 1) The interactive driving scenarios are diverse, including urban/highway/ramp merging and lane changes, roundabouts with yield/stop signs, signalized intersections, intersections with one/two/all-way stops, etc. 2) Motion data from different countries and different continents are collected so that driving preferences and styles in different cultures are naturally included. 3) The driving behavior is highly interactive and complex with adversarial and cooperative motions of various traffic participants. Highly complex behavior such as negotiations, aggressive/irrational decisions and traffic rule violations are densely contained in the dataset, while regular behavior can also be found from cautious car-following, stop, left/right/U-turn to rational lane-change and cycling and pedestrian crossing, etc. 4) The levels of criticality span wide, from regular safe operations to dangerous, near-collision maneuvers. Real collision, although relatively slight, is also included. 5) Maps with complete semantic information are provided with physical layers, reference lines, lanelet connections and traffic rules. The data is recorded from drones and traffic cameras. Statistics of the dataset in terms of number of entities and interaction density are also provided, along with some utilization examples in a variety of behavior-related research areas. The dataset can be downloaded via https://interaction-dataset.com.

ROJul 23, 2019
Generic Prediction Architecture Considering both Rational and Irrational Driving Behaviors

Yeping Hu, Liting Sun, Masayoshi Tomizuka

Accurately predicting future behaviors of surrounding vehicles is an essential capability for autonomous vehicles in order to plan safe and feasible trajectories. The behaviors of others, however, are full of uncertainties. Both rational and irrational behaviors exist, and the autonomous vehicles need to be aware of this in their prediction module. The prediction module is also expected to generate reasonable results in the presence of unseen and corner scenarios. Two types of prediction models are typically used to solve the prediction problem: learning-based model and planning-based model. Learning-based model utilizes real driving data to model the human behaviors. Depending on the structure of the data, learning-based models can predict both rational and irrational behaviors. But the balance between them cannot be customized, which creates challenges in generalizing the prediction results. Planning-based model, on the other hand, usually assumes human as a rational agent, i.e., it anticipates only rational behavior of human drivers. In this paper, a generic prediction architecture is proposed to address various rationalities in human behavior. We leverage the advantages from both learning-based and planning-based prediction models. The proposed approach is able to predict continuous trajectories that well-reflect possible future situations of other drivers. Moreover, the prediction performance remains stable under various unseen driving scenarios. A case study under a real-world roundabout scenario is provided to demonstrate the performance and capability of the proposed prediction architecture.

AIJul 19, 2019
Interpretable Modelling of Driving Behaviors in Interactive Driving Scenarios based on Cumulative Prospect Theory

Liting Sun, Wei Zhan, Yeping Hu et al.

Understanding human driving behavior is important for autonomous vehicles. In this paper, we propose an interpretable human behavior model in interactive driving scenarios based on the cumulative prospect theory (CPT). As a non-expected utility theory, CPT can well explain some systematically biased or ``irrational'' behavior/decisions of human that cannot be explained by the expected utility theory. Hence, the goal of this work is to formulate the human drivers' behavior generation model with CPT so that some ``irrational'' behavior or decisions of human can be better captured and predicted. Towards such a goal, we first develop a CPT-driven decision-making model focusing on driving scenarios with two interacting agents. A hierarchical learning algorithm is proposed afterward to learn the utility function, the value function, and the decision weighting function in the CPT model. A case study for roundabout merging is also provided as verification. With real driving data, the prediction performances of three different models are compared: a predefined model based on time-to-collision (TTC), a learning-based model based on neural networks, and the proposed CPT-based model. The results show that the proposed model outperforms the TTC model and achieves similar performance as the learning-based model with much less training data and better interpretability.

ROMay 2, 2019
Behavior Planning of Autonomous Cars with Social Perception

Liting Sun, Wei Zhan, Ching-Yao Chan et al.

Autonomous cars have to navigate in dynamic environment which can be full of uncertainties. The uncertainties can come either from sensor limitations such as occlusions and limited sensor range, or from probabilistic prediction of other road participants, or from unknown social behavior in a new area. To safely and efficiently drive in the presence of these uncertainties, the decision-making and planning modules of autonomous cars should intelligently utilize all available information and appropriately tackle the uncertainties so that proper driving strategies can be generated. In this paper, we propose a social perception scheme which treats all road participants as distributed sensors in a sensor network. By observing the individual behaviors as well as the group behaviors, uncertainties of the three types can be updated uniformly in a belief space. The updated beliefs from the social perception are then explicitly incorporated into a probabilistic planning framework based on Model Predictive Control (MPC). The cost function of the MPC is learned via inverse reinforcement learning (IRL). Such an integrated probabilistic planning module with socially enhanced perception enables the autonomous vehicles to generate behaviors which are defensive but not overly conservative, and socially compatible. The effectiveness of the proposed framework is verified in simulation on an representative scenario with sensor occlusions.

LGMar 22, 2019
Multi-modal Probabilistic Prediction of Interactive Behavior via an Interpretable Model

Yeping Hu, Wei Zhan, Liting Sun et al.

For autonomous agents to successfully operate in real world, the ability to anticipate future motions of surrounding entities in the scene can greatly enhance their safety levels since potentially dangerous situations could be avoided in advance. While impressive results have been shown on predicting each agent's behavior independently, we argue that it is not valid to consider road entities individually since transitions of vehicle states are highly coupled. Moreover, as the predicted horizon becomes longer, modeling prediction uncertainties and multi-modal distributions over future sequences will turn into a more challenging task. In this paper, we address this challenge by presenting a multi-modal probabilistic prediction approach. The proposed method is based on a generative model and is capable of jointly predicting sequential motions of each pair of interacting agents. Most importantly, our model is interpretable, which can explain the underneath logic as well as obtain more reliability to use in real applications. A complicate real-world roundabout scenario is utilized to implement and examine the proposed method.

ROMar 6, 2019
Towards Better Human Robot Collaboration with Robust Plan Recognition and Trajectory Prediction

Yujiao Cheng, Liting Sun, Changliu Liu et al.

Human robot collaboration (HRC) is becoming increasingly important as the paradigm of manufacturing is shifting from mass production to mass customization. The introduction of HRC can significantly improve the flexibility and intelligence of automation. However, due to the stochastic and time-varying nature of human collaborators, it is challenging for the robot to efficiently and accurately identify the plan of human and respond in a safe manner. To address this challenge, we propose an integrated human robot collaboration framework in this paper which includes both plan recognition and trajectory prediction. Such a framework enables the robots to perceive, predict and adapt their actions to the human's plan and intelligently avoid collisions with the human based on the predicted human trajectory. Moreover, by explicitly leveraging the hierarchical relationship between the plan and trajectories, more robust plan recognition performance can be achieved. Experiments are conducted on an industrial robot to verify the proposed framework, which shows that our proposed framework can not only assure safe HRC, but also improve the time efficiency of the HRC team, and the plan recognition module is not sensitive to noises.

ROSep 10, 2018
Towards a Fatality-Aware Benchmark of Probabilistic Reaction Prediction in Highly Interactive Driving Scenarios

Wei Zhan, Liting Sun, Yeping Hu et al.

Autonomous vehicles should be able to generate accurate probabilistic predictions for uncertain behavior of other road users. Moreover, reactive predictions are necessary in highly interactive driving scenarios to answer "what if I take this action in the future" for autonomous vehicles. There is no existing unified framework to homogenize the problem formulation, representation simplification, and evaluation metric for various prediction methods, such as probabilistic graphical models (PGM), neural networks (NN) and inverse reinforcement learning (IRL). In this paper, we formulate a probabilistic reaction prediction problem, and reveal the relationship between reaction and situation prediction problems. We employ prototype trajectories with designated motion patterns other than "intention" to homogenize the representation so that probabilities corresponding to each trajectory generated by different methods can be evaluated. We also discuss the reasons why "intention" is not suitable to serve as a motion indicator in highly interactive scenarios. We propose to use Brier score as the baseline metric for evaluation. In order to reveal the fatality of the consequences when the predictions are adopted by decision-making and planning, we propose a fatality-aware metric, which is a weighted Brier score based on the criticality of the trajectory pairs of the interacting entities. Conservatism and non-defensiveness are defined from the weighted Brier score to indicate the consequences caused by inaccurate predictions. Modified methods based on PGM, NN and IRL are provided to generate probabilistic reaction predictions in an exemplar scenario of nudging from a highway ramp. The results are evaluated by the baseline and proposed metrics to construct a mini benchmark. Analysis on the properties of each method is also provided by comparing the baseline and proposed metric scores.

LGSep 9, 2018
Probabilistic Prediction of Interactive Driving Behavior via Hierarchical Inverse Reinforcement Learning

Liting Sun, Wei Zhan, Masayoshi Tomizuka

Autonomous vehicles (AVs) are on the road. To safely and efficiently interact with other road participants, AVs have to accurately predict the behavior of surrounding vehicles and plan accordingly. Such prediction should be probabilistic, to address the uncertainties in human behavior. Such prediction should also be interactive, since the distribution over all possible trajectories of the predicted vehicle depends not only on historical information, but also on future plans of other vehicles that interact with it. To achieve such interaction-aware predictions, we propose a probabilistic prediction approach based on hierarchical inverse reinforcement learning (IRL). First, we explicitly consider the hierarchical trajectory-generation process of human drivers involving both discrete and continuous driving decisions. Based on this, the distribution over all future trajectories of the predicted vehicle is formulated as a mixture of distributions partitioned by the discrete decisions. Then we apply IRL hierarchically to learn the distributions from real human demonstrations. A case study for the ramp-merging driving scenario is provided. The quantitative results show that the proposed approach can accurately predict both the discrete driving decisions such as yield or pass as well as the continuous trajectories.

ROAug 8, 2018
Courteous Autonomous Cars

Liting Sun, Wei Zhan, Masayoshi Tomizuka et al.

Typically, autonomous cars optimize for a combination of safety, efficiency, and driving quality. But as we get better at this optimization, we start seeing behavior go from too conservative to too aggressive. The car's behavior exposes the incentives we provide in its cost function. In this work, we argue for cars that are not optimizing a purely selfish cost, but also try to be courteous to other interactive drivers. We formalize courtesy as a term in the objective that measures the increase in another driver's cost induced by the autonomous car's behavior. Such a courtesy term enables the robot car to be aware of possible irrationality of the human behavior, and plan accordingly. We analyze the effect of courtesy in a variety of scenarios. We find, for example, that courteous robot cars leave more space when merging in front of a human driver. Moreover, we find that such a courtesy term can help explain real human driver behavior on the NGSIM dataset.

AIJul 9, 2017
A Fast Integrated Planning and Control Framework for Autonomous Driving via Imitation Learning

Liting Sun, Cheng Peng, Wei Zhan et al.

For safe and efficient planning and control in autonomous driving, we need a driving policy which can achieve desirable driving quality in long-term horizon with guaranteed safety and feasibility. Optimization-based approaches, such as Model Predictive Control (MPC), can provide such optimal policies, but their computational complexity is generally unacceptable for real-time implementation. To address this problem, we propose a fast integrated planning and control framework that combines learning- and optimization-based approaches in a two-layer hierarchical structure. The first layer, defined as the "policy layer", is established by a neural network which learns the long-term optimal driving policy generated by MPC. The second layer, called the "execution layer", is a short-term optimization-based controller that tracks the reference trajecotries given by the "policy layer" with guaranteed short-term safety and feasibility. Moreover, with efficient and highly-representative features, a small-size neural network is sufficient in the "policy layer" to handle many complicated driving scenarios. This renders online imitation learning with Dataset Aggregation (DAgger) so that the performance of the "policy layer" can be improved rapidly and continuously online. Several exampled driving scenarios are demonstrated to verify the effectiveness and efficiency of the proposed framework.