Ching-Yao Chan

RO
h-index: 13
18 papers
1,071 citations
Novelty: 44%
AI Score: 28

18 Papers

SY May 19, 2019 Code
A Reinforcement Learning Approach for Intelligent Traffic Signal Control at Urban Intersections

Mengyu Guo, Pin Wang, Ching-Yao Chan et al.

Ineffective and inflexible traffic signal control at urban intersections can often lead to bottlenecks in traffic flows and cause congestion, delay, and environmental problems. How to manage traffic smartly by intelligent signal control is a significant challenge in urban traffic management. With recent advances in machine learning, especially reinforcement learning (RL), traffic signal control using advanced machine learning techniques represents a promising solution to tackle this problem. In this paper, we propose an RL approach for traffic signal control at urban intersections. Specifically, we use neural networks as the Q-function approximator (a.k.a. Q-network) to deal with the complex traffic signal control problem where the state space is large and the action space can be discrete. The state space is defined based on real-time traffic information, i.e., vehicle position, direction, and speed. The action space includes various traffic signal phases, which are critical in generating a reasonable and realistic control mechanism, given the prominent spatial-temporal characteristics of urban traffic. In the simulation experiment, we use SUMO, an open source traffic simulator, to construct realistic urban intersection settings. Moreover, we use different traffic patterns, such as major/minor road traffic, through/left-turn lane traffic, tidal traffic, and varying demand traffic, to train a generalized traffic signal control model that can be adapted to various traffic conditions. The simulation results demonstrate the convergence and generalization performance of our RL approach as well as its significant benefits in terms of queue length and wait time over several benchmarking methods in traffic signal control.

CV Dec 22, 2023
Learning Socio-Temporal Graphs for Multi-Agent Trajectory Prediction

Yuke Li, Lixiong Chen, Guangyi Chen et al.

In order to predict a pedestrian's trajectory in a crowd accurately, one has to take into account her/his underlying socio-temporal interactions with other pedestrians consistently. Unlike existing work that represents the relevant information separately, partially, or implicitly, we propose a complete representation for it to be fully and explicitly captured and analyzed. In particular, we introduce a Directed Acyclic Graph-based structure, which we term Socio-Temporal Graph (STG), to explicitly capture pair-wise socio-temporal interactions among a group of people across both space and time. Our model is built on a time-varying generative process, whose latent variables determine the structure of the STGs. We design an attention-based model named STGformer that affords an end-to-end pipeline to learn the structure of the STGs for trajectory prediction. Our solution achieves overall state-of-the-art prediction accuracy in two large-scale benchmark datasets. Our analysis shows that a person's past trajectory is critical for predicting another person's future path. Our model learns this relationship with a strong notion of socio-temporal localities. Statistics show that utilizing this information explicitly for prediction yields a noticeable performance gain with respect to the trajectory-only approaches.

RO May 29, 2021
A Survey of Deep Reinforcement Learning Algorithms for Motion Planning and Control of Autonomous Vehicles

Fei Ye, Shen Zhang, Pin Wang et al.

In this survey, we systematically summarize the current literature on studies that apply reinforcement learning (RL) to the motion planning and control of autonomous vehicles. Many existing contributions can be attributed to the pipeline approach, which consists of many hand-crafted modules, each with a functionality selected for the ease of human interpretation. However, this approach does not automatically guarantee maximal performance due to the lack of a system-level optimization. Therefore, this paper also presents a growing trend of work that falls into the end-to-end approach, which typically offers better performance and smaller system scales. However, the performance of end-to-end methods also suffers from the lack of expert data and from generalization issues. Finally, the remaining challenges in applying deep RL algorithms to autonomous driving are summarized, and future research directions are presented to tackle these challenges.

LG Mar 23, 2021
Meta-Adversarial Inverse Reinforcement Learning for Decision-making Tasks

Pin Wang, Hanhan Li, Ching-Yao Chan

Learning from demonstrations has made great progress over the past few years. However, it is generally data hungry and task specific. In other words, it requires a large amount of data to train a decent model on a particular task, and the model often fails to generalize to new tasks that have a different distribution. In practice, demonstrations from new tasks will be continuously observed and the data might be unlabeled or only partially labeled. Therefore, it is desirable for the trained model to adapt to new tasks that have limited data samples available. In this work, we build an adaptable imitation learning model based on the integration of Meta-learning and Adversarial Inverse Reinforcement Learning (Meta-AIRL). We exploit the adversarial learning and inverse reinforcement learning mechanisms to learn policies and reward functions simultaneously from available training tasks and then adapt them to new tasks with the meta-learning framework. Simulation results show that the adapted policy trained with Meta-AIRL can effectively learn from a limited number of demonstrations and quickly reach performance comparable to that of the experts on unseen tasks.

LG Aug 28, 2020
Meta Reinforcement Learning-Based Lane Change Strategy for Autonomous Vehicles

Fei Ye, Pin Wang, Ching-Yao Chan et al.

Recent advances in supervised learning and reinforcement learning have provided new opportunities to apply related methodologies to automated driving. However, there are still challenges in achieving automated driving maneuvers in dynamically changing environments. Supervised learning algorithms such as imitation learning can generalize to new environments by training on a large amount of labeled data; however, it is often impractical or cost-prohibitive to obtain sufficient data for each new environment. Although reinforcement learning methods can mitigate this data-dependency issue by training the agent in a trial-and-error way, they still need to re-train policies from scratch when adapting to new environments. In this paper, we thus propose a meta reinforcement learning (MRL) method to improve the agent's generalization capabilities to make automated lane-changing maneuvers in different traffic environments, which are formulated as different traffic congestion levels. Specifically, we train the model at light to moderate traffic densities and test it at a new heavy traffic density condition. We use both collision rate and success rate to quantify the safety and effectiveness of the proposed model. A benchmark model is developed based on a pretraining method, which uses the same network structure and training tasks as our proposed model for fair comparison. The simulation results show that the proposed method achieves an overall success rate up to 20% higher than the benchmark model when it is generalized to the new environment of heavy traffic density. The collision rate is also up to 18% lower than that of the benchmark model. Finally, the proposed model shows more stable and efficient generalization capabilities when adapting to the new environment, and it can achieve a 100% success rate and a 0% collision rate with only a few steps of gradient updates.

LG Feb 7, 2020
Automated Lane Change Strategy using Proximal Policy Optimization-based Deep Reinforcement Learning

Fei Ye, Xuxin Cheng, Pin Wang et al.

Lane-change maneuvers are commonly executed by drivers to follow a certain routing plan, overtake a slower vehicle, adapt to a merging lane ahead, etc. However, improper lane change behaviors can be a major cause of traffic flow disruptions and even crashes. While many rule-based methods have been proposed to solve lane change problems for autonomous driving, they tend to exhibit limited performance due to the uncertainty and complexity of the driving environment. Machine learning-based methods offer an alternative approach, as Deep reinforcement learning (DRL) has shown promising success in many application domains including robotic manipulation, navigation, and playing video games. However, applying DRL to autonomous driving still faces many practical challenges in terms of slow learning rates, sample inefficiency, and safety concerns. In this study, we propose an automated lane change strategy using proximal policy optimization-based deep reinforcement learning, which shows great advantages in learning efficiency while still maintaining stable performance. The trained agent is able to learn a smooth, safe, and efficient driving policy to make lane-change decisions (i.e. when and how) in a challenging situation such as dense traffic scenarios. The effectiveness of the proposed policy is validated by using metrics of task success rate and collision rate. The simulation results demonstrate the lane change maneuvers can be efficiently learned and executed in a safe, smooth, and efficient manner.

CV Feb 4, 2020
TPPO: A Novel Trajectory Predictor with Pseudo Oracle

Biao Yang, Caizhen He, Pin Wang et al.

Forecasting pedestrian trajectories in dynamic scenes remains a critical problem in various applications, such as autonomous driving and socially aware robots. Such forecasting is challenging due to human-human and human-object interactions and future uncertainties caused by human randomness. Generative model-based methods handle future uncertainties by sampling a latent variable. However, few studies explored the generation of the latent variable. In this work, we propose the Trajectory Predictor with Pseudo Oracle (TPPO), which is a generative model-based trajectory predictor. The first pseudo oracle is pedestrians' moving directions, and the second one is the latent variable estimated from ground truth trajectories. A social attention module is used to aggregate neighbors' interactions based on the correlation between pedestrians' moving directions and future trajectories. This correlation is inspired by the fact that pedestrians' future trajectories are often influenced by pedestrians in front. A latent variable predictor is proposed to estimate latent variable distributions from observed and ground-truth trajectories. Moreover, the gap between these two distributions is minimized during training. Therefore, the latent variable predictor can estimate the latent variable from observed trajectories to approximate that estimated from ground-truth trajectories. We compare the performance of TPPO with related methods on several public datasets. Results demonstrate that TPPO outperforms state-of-the-art methods with low average and final displacement errors. The ablation study shows that the prediction performance will not dramatically decrease as sampling times decline during tests.
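The average and final displacement errors mentioned in this abstract are commonly computed as the mean and the last-step Euclidean distance between a predicted and a ground-truth trajectory. The following is a minimal sketch of that common definition, not the authors' evaluation code; the function name and the sample trajectories are hypothetical.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between prediction and ground truth at each time step.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)  # average over all predicted steps
    fde = distances[-1]                    # error at the final step only
    return ade, fde


# Hypothetical three-step trajectories (units are arbitrary here).
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, ground_truth)
print(f"ADE={ade:.2f}, FDE={fde:.2f}")
```

For generative predictors that draw several samples per pedestrian, a frequent convention is to report the best sample's errors; whether TPPO does so is not stated in the abstract.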

LG Nov 29, 2019
Quadratic Q-network for Learning Continuous Control for Autonomous Vehicles

Pin Wang, Hanhan Li, Ching-Yao Chan

Reinforcement Learning algorithms have recently been proposed to learn time-sequential control policies in the field of autonomous driving. Direct applications of Reinforcement Learning algorithms with discrete action space will yield unsatisfactory results at the operational level of driving, where continuous control actions are actually required. In addition, the design of neural networks often fails to incorporate the domain knowledge of the target problem, such as the classical control theories in our case. In this paper, we propose a hybrid model that combines Q-learning and a classic PID (Proportional-Integral-Derivative) controller for handling continuous vehicle control problems in dynamic driving environments. In particular, instead of using a big neural network as the Q-function approximator, we design a Quadratic Q-function over actions with multiple simple neural networks for finding optimal values within a continuous space. We also build an action network based on the domain knowledge of the control mechanism of a PID controller to guide the agent to explore optimal actions more efficiently. We test our proposed approach in simulation under two common but challenging driving situations, the lane change scenario and the ramp merge scenario. Results show that the autonomous vehicle agent can successfully learn a smooth and efficient driving behavior in both situations.
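For readers unfamiliar with the PID control law that this abstract draws on, a textbook discrete-time version can be sketched as follows. This illustrates only the classic controller, not the paper's action network; the class name, gains, and error values are hypothetical.

```python
class PIDController:
    """Textbook discrete PID: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first step

    def step(self, error, dt):
        # Accumulate the integral term with a simple rectangle rule.
        self.integral += error * dt
        # Backward-difference derivative; zero until a previous error exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and tracking errors, e.g. lateral offset in metres.
pid = PIDController(kp=1.0, ki=0.5, kd=0.1)
u1 = pid.step(error=2.0, dt=0.1)
u2 = pid.step(error=1.0, dt=0.1)
print(u1, u2)
```

How the paper couples this structure with the learned Q-function is not described in the abstract, so the sketch stops at the control law itself.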

AI Nov 19, 2019
Decision Making for Autonomous Driving via Augmented Adversarial Inverse Reinforcement Learning

Pin Wang, Dapeng Liu, Jiayu Chen et al.

Making decisions in complex driving environments is a challenging task for autonomous agents. Imitation learning methods have great potential for achieving such a goal. Adversarial Inverse Reinforcement Learning (AIRL) is one of the state-of-the-art imitation learning methods that can learn both a behavioral policy and a reward function simultaneously, yet it has only been demonstrated in simple and static environments where no interactions are introduced. In this paper, we improve and stabilize AIRL's performance by augmenting it with semantic rewards in the learning framework. Additionally, we adapt the augmented AIRL to a more practical and challenging decision-making task in a highly interactive environment in autonomous driving. The proposed method is compared with four baselines and evaluated by four performance metrics. Simulation results show that the augmented AIRL outperforms all the baseline methods, and its performance is comparable with that of the experts on all four metrics.

LG Jun 6, 2019
Intention-aware Long Horizon Trajectory Prediction of Surrounding Vehicles using Dual LSTM Networks

Long Xin, Pin Wang, Ching-Yao Chan et al.

As autonomous vehicles (AVs) need to interact with other road users, it is important to comprehensively understand the dynamic traffic environment, especially the possible future trajectories of surrounding vehicles. This paper presents an algorithm for long-horizon trajectory prediction of surrounding vehicles using a dual long short-term memory (LSTM) network, which is capable of effectively improving prediction accuracy in strongly interactive driving environments. In contrast to traditional approaches, which require trajectory matching and manual feature selection, this method can automatically learn high-level spatial-temporal features of driver behaviors from naturalistic driving data through sequence learning. By employing two blocks of LSTMs, the proposed method feeds the sequential trajectory to the first LSTM for driver intention recognition as an intermediate indicator, which is immediately followed by a second LSTM for future trajectory prediction. Test results from real-world highway driving data show that the proposed method can, in comparison to state-of-the-art methods, output more accurate and reasonable estimates of different future trajectories over a 5 s time horizon, with root mean square error (RMSE) for longitudinal and lateral prediction less than 5.77 m and 0.49 m, respectively.
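The RMSE figures quoted in this abstract are reported separately for the longitudinal and lateral directions. A minimal sketch of that per-axis computation is given below; the function name and the sample positions are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equally long sequences of values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and equally long")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted vs. observed positions in metres, one value per time step.
longitudinal_rmse = rmse([10.0, 20.0, 30.0], [11.0, 18.0, 30.0])
lateral_rmse = rmse([0.0, 0.5], [0.3, 0.1])
print(f"longitudinal={longitudinal_rmse:.3f} m, lateral={lateral_rmse:.3f} m")
```

Evaluating each axis on its own keeps small lateral errors, which matter for lane assignment, from being swamped by the much larger longitudinal scale.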

RO Jun 5, 2019
Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm

Pin Wang, Hanhan Li, Ching-Yao Chan

Lane change is a challenging task which requires delicate actions to ensure safety and comfort. Some recent studies have attempted to solve the lane-change control problem with Reinforcement Learning (RL), yet the action is confined to a discrete action space. To overcome this limitation, we formulate the lane change behavior with continuous actions in a model-free dynamic driving environment based on Deep Deterministic Policy Gradient (DDPG). The reward function, which is critical for learning the optimal policy, is defined by control values, position deviation status, and maneuvering time to provide the RL agent with informative signals. The RL agent is trained from scratch without resorting to any prior knowledge of the environment and vehicle dynamics, since they are not easy to obtain. Seven models under different hyperparameter settings are compared. A video showing the learning progress of the driving behavior is available. It demonstrates that the RL vehicle agent initially runs off the road frequently but eventually manages to change smoothly and stably to the target lane, with a success rate of 100% under diverse driving situations in simulation.

RO May 2, 2019
Behavior Planning of Autonomous Cars with Social Perception

Liting Sun, Wei Zhan, Ching-Yao Chan et al.

Autonomous cars have to navigate in dynamic environments which can be full of uncertainties. The uncertainties can come from sensor limitations such as occlusions and limited sensor range, from probabilistic prediction of other road participants, or from unknown social behavior in a new area. To drive safely and efficiently in the presence of these uncertainties, the decision-making and planning modules of autonomous cars should intelligently utilize all available information and appropriately tackle the uncertainties so that proper driving strategies can be generated. In this paper, we propose a social perception scheme which treats all road participants as distributed sensors in a sensor network. By observing the individual behaviors as well as the group behaviors, uncertainties of the three types can be updated uniformly in a belief space. The updated beliefs from the social perception are then explicitly incorporated into a probabilistic planning framework based on Model Predictive Control (MPC). The cost function of the MPC is learned via inverse reinforcement learning (IRL). Such an integrated probabilistic planning module with socially enhanced perception enables autonomous vehicles to generate behaviors which are defensive but not overly conservative, and socially compatible. The effectiveness of the proposed framework is verified in simulation on a representative scenario with sensor occlusions.

RO Apr 23, 2019
Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning

Tianyu Shi, Pin Wang, Xuxin Cheng et al.

We apply Deep Q-network (DQN) with the consideration of safety during the task for deciding whether to conduct the maneuver. Furthermore, we design two similar Deep Q learning frameworks with quadratic approximator for deciding how to select a comfortable gap and just follow the preceding vehicle. Finally, a polynomial lane change trajectory is generated and Pure Pursuit Control is implemented for path tracking. We demonstrate the effectiveness of this framework in simulation, from both the decision-making and control layers. The proposed architecture also has the potential to be extended to other autonomous driving scenarios.
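The last stage described in this abstract, a polynomial lane change trajectory tracked by Pure Pursuit Control, can be sketched roughly as follows. The abstract does not state the polynomial degree, so a quintic lateral profile with zero boundary velocity and acceleration is assumed here; the pure pursuit steering law is the standard bicycle-model form. Function names and all numeric values are hypothetical.

```python
import math


def quintic_lateral_offset(t, duration, lane_width):
    """Lateral offset at time t for a smooth lane change (assumed quintic profile)."""
    tau = min(max(t / duration, 0.0), 1.0)  # normalised time clamped to [0, 1]
    return lane_width * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)


def pure_pursuit_steering(x, y, yaw, target_x, target_y, wheelbase):
    """Steering angle (rad) that arcs the rear axle at (x, y) toward the target point."""
    dx, dy = target_x - x, target_y - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        raise ValueError("target point coincides with the vehicle position")
    # Angle between the vehicle heading and the line to the lookahead point.
    alpha = math.atan2(dy, dx) - yaw
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


# Hypothetical 4 s lane change across a 3.5 m lane, sampled at the midpoint.
mid_offset = quintic_lateral_offset(t=2.0, duration=4.0, lane_width=3.5)
# Steer toward a point 5 m ahead on the planned path, with a 2.7 m wheelbase.
steer = pure_pursuit_steering(0.0, 0.0, 0.0, 5.0, mid_offset, wheelbase=2.7)
print(mid_offset, steer)
```

A positive steering angle here means a turn toward positive y, matching the direction of the planned lateral offset.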

RO Jan 31, 2019
A Data Driven Method of Optimizing Feedforward Compensator for Autonomous Vehicle

Tianyu Shi, Pin Wang, Ching-Yao Chan et al.

A reliable controller is critical and essential for the execution of safe and smooth maneuvers of an autonomous vehicle. The controller must be robust to external disturbances, such as road surface, weather, and wind conditions. It also needs to deal with the internal parametric variations of vehicle sub-systems, including power-train efficiency, measurement errors, time delay, and so on. Moreover, as in most production vehicles, the low-level control commands for the engine, brake, and steering systems are delivered through separate electronic control units. These factors introduce opacity and ineffectiveness issues in controller performance. In this paper, we design a feed-forward compensation process via a data-driven method to model and further optimize the controller performance. We apply principal component analysis to extract the most influential features. Subsequently, we adopt a time delay neural network and include the accuracy of the predicted error over a future time horizon. Utilizing the predicted error, we then design a feed-forward compensation process to improve the control performance. Finally, we demonstrate the effectiveness of the proposed feed-forward compensation process in simulation scenarios.

RO Apr 21, 2018
A Reinforcement Learning Based Approach for Automated Lane Change Maneuvers

Pin Wang, Ching-Yao Chan, Arnaud de La Fortelle

Lane change is a crucial vehicle maneuver which needs coordination with surrounding vehicles. Automated lane changing functions built on rule-based models may perform well under pre-defined operating conditions, but they may be prone to failure when unexpected situations are encountered. In our study, we proposed a Reinforcement Learning based approach to train the vehicle agent to learn an automated lane change behavior such that it can intelligently make a lane change under diverse and even unforeseen scenarios. In particular, we treated both the state space and the action space as continuous, and designed a Q-function approximator that has a closed-form greedy policy, which contributes to the computational efficiency of our deep Q-learning algorithm. Extensive simulations are conducted for training the algorithm, and the results illustrate that the Reinforcement Learning based vehicle agent is capable of learning a smooth and efficient driving policy for lane change maneuvers.

AI Mar 25, 2018
Autonomous Ramp Merge Maneuver Based on Reinforcement Learning with Continuous Action Space

Pin Wang, Ching-Yao Chan

Ramp merging is a critical maneuver for road safety and traffic efficiency. Most of the current automated driving systems developed by multiple automobile manufacturers and suppliers are typically limited to restricted-access freeways only. Extending the automated mode to ramp merging zones presents substantial challenges. One is that the automated vehicle needs to incorporate a future objective (e.g., a successful and smooth merge) and optimize a long-term reward that is impacted by subsequent actions when executing the current action. Furthermore, the merging process involves interaction between the merging vehicle and its surrounding vehicles, whose behavior may be cooperative or adversarial, leading to distinct merging countermeasures that are crucial to successfully completing the merge. In place of the conventional rule-based approaches, we propose to apply a reinforcement learning algorithm to the automated vehicle agent to find an optimal driving policy by maximizing the long-term reward in an interactive driving environment. Most importantly, in contrast to most reinforcement learning applications in which the action space is resolved as discrete, our approach treats the action space as well as the state space as continuous without incurring additional computational costs. Our unique contribution is the design of the Q-function approximation, whose format is structured as a quadratic function, by which simple but effective neural networks are used to estimate its coefficients. The results obtained through the implementation of our training platform demonstrate that the vehicle agent is able to learn a safe, smooth, and timely merging policy, indicating the effectiveness and practicality of our approach.
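To illustrate why a quadratic Q-function makes continuous actions cheap to handle, the sketch below assumes the form Q(s, a) = A(s)(a - B(s))^2 + C(s) with A(s) < 0. This is one parametrization consistent with the abstract, not one it confirms, and the numeric coefficients stand in for the outputs of the coefficient networks; all names and values are hypothetical.

```python
def quadratic_q(action, coeff_a, coeff_b, coeff_c):
    """Q-value of a scalar action under the assumed form A*(a - B)^2 + C."""
    return coeff_a * (action - coeff_b) ** 2 + coeff_c


def greedy_action(coeff_a, coeff_b, a_min, a_max):
    """Closed-form argmax over [a_min, a_max]; requires a concave quadratic."""
    if coeff_a >= 0:
        raise ValueError("coeff_a must be negative so the vertex is a maximum")
    # A concave quadratic peaks at its vertex B; clip the vertex to the action bounds.
    return min(max(coeff_b, a_min), a_max)


# Hypothetical coefficients that a network might output for one state.
A, B, C = -2.0, 1.5, 3.0
a_min, a_max = -1.0, 1.0  # e.g. a normalised acceleration range
best_action = greedy_action(A, B, a_min, a_max)
best_value = quadratic_q(best_action, A, B, C)
print(best_action, best_value)
```

Because the maximizing action is read directly from the coefficients, no search over actions and no separate actor network is needed at decision time.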

RO Mar 25, 2018
Automated Driving Maneuvers under Interactive Environment based on Deep Reinforcement Learning

Pin Wang, Ching-Yao Chan, Hanhan Li

Safe and efficient autonomous driving maneuvers in an interactive and complex environment can be considerably challenging due to the unpredictable actions of other surrounding agents that may be cooperative or adversarial in their interactions with the ego vehicle. One of the state-of-the-art approaches is to apply Reinforcement Learning (RL) to learn a time-sequential driving policy, to execute a proper control strategy or tracking trajectory in dynamic situations. However, direct application of RL algorithms is not satisfactory enough to deal with the cases in the autonomous driving domain, mainly due to the complex driving environment and continuous action space. In this paper, we adopt Q-learning as our basic learning framework and design a unique format of the Q-function approximator that consists of neural networks to handle the continuous action space challenge. The learning model is presented in a closed form of continuous control variables and trained in a simulation platform that we have developed with embedded properties of real-time vehicle interactions. The proposed algorithm avoids invoking an additional actor network that learns to take actions, as in actor-critic algorithms. At the same time, some prior knowledge of vehicle dynamics is also fed into the model to assist learning. We test our algorithm with a challenging use case, the lane change maneuver, to verify the practicability and feasibility of the proposed approach. Results from accumulated rewards and vehicle performance show that RL vehicle agents successfully learn a safe, comfortable, and efficient driving policy as defined in the reward function.

LG Sep 7, 2017
Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge

Pin Wang, Ching-Yao Chan

Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways only; that is, the transition from manual to automated modes takes place only after the ramp merging process is completed manually. One major challenge in extending the automation to ramp merging is that the automated vehicle needs to incorporate and optimize long-term objectives (e.g., a successful and smooth merge) while near-term actions must be safely executed. Moreover, the merging process involves interactions with other vehicles whose behaviors are sometimes hard to predict but may influence the merging vehicle's optimal actions. To tackle such a complicated control problem, we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an optimal driving policy by maximizing the long-term reward in an interactive environment. Specifically, we apply a Long Short-Term Memory (LSTM) architecture to model the interactive environment, from which an internal state containing historical driving information is conveyed to a Deep Q-Network (DQN). The DQN is used to approximate the Q-function, which takes the internal state as input and generates Q-values as output for action selection. With this DRL architecture, the historical impact of the interactive environment on the long-term reward can be captured and taken into account for deciding the optimal control policy. The proposed architecture has the potential to be extended and applied to other autonomous driving scenarios such as driving through a complex intersection or changing lanes under varying traffic flow conditions.