LG, Feb 9, 2023
Hierarchical Generative Adversarial Imitation Learning with Mid-level Input Generation for Autonomous Driving on Urban Environments
Gustavo Claudio Karl Couto, Eric Aislan Antonelo
Deriving robust control policies for realistic urban navigation scenarios is not a trivial task. In an end-to-end approach, these policies must map high-dimensional images from the vehicle's cameras to low-level actions such as steering and throttle. While pure Reinforcement Learning (RL) approaches rely exclusively on engineered rewards, Generative Adversarial Imitation Learning (GAIL) agents learn from expert demonstrations while interacting with the environment, which favors GAIL on tasks for which a reward signal is difficult to derive, such as autonomous driving. However, training deep networks directly from raw images on RL tasks is known to be unstable and troublesome. To address this, this work proposes a hierarchical GAIL-based architecture (hGAIL) for autonomous vehicle navigation that decouples representation learning from the driving task. The proposed architecture consists of two modules: a GAN (Generative Adversarial Net) that generates an abstract mid-level input representation, namely the Bird's-Eye View (BEV) of the vehicle's surroundings; and a GAIL module that learns to control the vehicle using the BEV predictions from the GAN as input. hGAIL is able to learn both the policy and the mid-level representation simultaneously as the agent interacts with the environment. Our experiments in the CARLA simulation environment show that GAIL trained exclusively from camera images (without BEV) fails to even learn the task, whereas hGAIL, after training on only one city, navigated successfully through 98% of the intersections of a new city not used in the training phase. Videos and code available at: https://sites.google.com/view/hgail
LG, Nov 30, 2022
Investigation of Proper Orthogonal Decomposition for Echo State Networks
Jean Panaioti Jordanou, Eric Aislan Antonelo, Eduardo Camponogara et al.
Echo State Networks (ESNs) are a type of Recurrent Neural Network that yields promising results in representing time series and nonlinear dynamic systems. Although they are equipped with a very efficient training procedure, Reservoir Computing strategies such as the ESN require high-order networks, i.e., many neurons, resulting in a number of states that is orders of magnitude larger than the number of model inputs and outputs. A large number of states not only makes the time-step computation more costly but may also pose robustness issues, especially when applying ESNs to problems such as Model Predictive Control (MPC) and other optimal control problems. One way to circumvent this complexity issue is through Model Order Reduction strategies such as the Proper Orthogonal Decomposition (POD) and its variants (POD-DEIM), whereby we find an equivalent lower-order representation of an already trained high-dimensional ESN. To this end, this work investigates and analyzes the performance of POD methods in Echo State Networks, evaluating their effectiveness through the Memory Capacity (MC) of the POD-reduced network compared to the original (full-order) ESN. We also perform experiments on two numerical case studies: a NARMA10 difference equation and an oil platform containing two wells and one riser. The results show that there is little loss of performance when comparing the original ESN to a POD-reduced counterpart, and that the performance of a POD-reduced ESN tends to be superior to that of a normal ESN of the same size. Also, the POD-reduced network achieves speedups of around $80\%$ compared to the original ESN.
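As a rough illustration of the reduction described in this abstract, the sketch below builds a small random ESN, collects state snapshots, and projects the state update onto the leading POD modes. Everything specific here is an assumption made for illustration and is not taken from the paper: the reservoir size, the weight scaling, the number of modes, the function names, and the use of a plain Galerkin projection.

```python
import numpy as np

def make_esn(n_res, n_in, spectral_radius=0.9, seed=0):
    """Random reservoir and input weights (hypothetical sizes and scaling)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    return W, W_in

def run_full(W, W_in, U):
    """Full-order update x[k+1] = tanh(W x[k] + W_in u[k]); returns all states."""
    x = np.zeros(W.shape[0])
    states = []
    for u in U:
        x = np.tanh(W @ x + W_in @ u)
        states.append(x)
    return np.array(states)

def pod_basis(states, r):
    """Leading r left singular vectors of the snapshot matrix (n_res x steps)."""
    modes, _, _ = np.linalg.svd(states.T, full_matrices=False)
    return modes[:, :r]

def run_reduced(W, W_in, V, U):
    """Projected update z[k+1] = V^T tanh(W V z[k] + W_in u[k])."""
    z = np.zeros(V.shape[1])
    WV = W @ V  # precomputed once, shape (n_res, r)
    states = []
    for u in U:
        z = V.T @ np.tanh(WV @ z + W_in @ u)
        states.append(z)
    return np.array(states)

rng = np.random.default_rng(1)
U = rng.uniform(-1.0, 1.0, (500, 1))       # random scalar input signal
W, W_in = make_esn(n_res=100, n_in=1)
X = run_full(W, W_in, U)                   # full-order states, shape (500, 100)
V = pod_basis(X, r=20)                     # POD basis, shape (100, 20)
Z = run_reduced(W, W_in, V, U)             # reduced states, shape (500, 20)
rel_err = np.linalg.norm(X - Z @ V.T) / np.linalg.norm(X)
print(f"reduced from {X.shape[1]} to {Z.shape[1]} states, relative error {rel_err:.3f}")
```

Note that this plain projection still evaluates `tanh` on all 100 reservoir units at every step, so it shrinks the state but not the cost of the nonlinearity; interpolation-based variants such as the POD-DEIM mentioned in the abstract target that remaining cost.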
LG, Sep 27, 2024
Physics-Informed Echo State Networks for Modeling Controllable Dynamical Systems
Eric Mochiutti, Eric Aislan Antonelo, Eduardo Camponogara
Echo State Networks (ESNs) are recurrent neural networks usually employed for modeling nonlinear dynamic systems with relative ease of training. By incorporating physical laws into the training of ESNs, Physics-Informed ESNs (PI-ESNs) were initially proposed to model chaotic dynamic systems without external inputs. They require less data for training since the Ordinary Differential Equations (ODEs) of the considered system help to regularize the ESN. In this work, the PI-ESN is extended with external inputs to model controllable nonlinear dynamic systems. Additionally, an existing self-adaptive balancing loss method is employed to balance the contributions of the residual regression term and the physics-informed loss term in the total loss function. Experiments were carried out with two nonlinear systems modeled by ODEs, the Van der Pol oscillator and the four-tank system, and with one differential-algebraic equation (DAE) system, an electric submersible pump. They revealed that the proposed PI-ESN outperforms the conventional ESN, especially in scenarios with limited data availability: PI-ESNs can regularize an ESN model with external inputs previously trained on just a few data points, reducing its overfitting and improving its generalization error (up to 92% relative reduction in the test error). Further experiments demonstrated that the proposed PI-ESN is robust to parametric uncertainties in the ODEs and that model predictive control using the PI-ESN outperforms the one using a plain ESN, particularly when training data is scarce.
LG, Mar 4, 2024
Physics-Informed Neural Networks with Skip Connections for Modeling and Control of Gas-Lifted Oil Wells
Jonas Ekeland Kittelsen, Eric Aislan Antonelo, Eduardo Camponogara et al.
Neural networks, while powerful, often lack interpretability. Physics-Informed Neural Networks (PINNs) address this limitation by incorporating physical laws into the loss function, making them applicable to solving Ordinary Differential Equations (ODEs) and Partial Differential Equations (PDEs). The recently introduced PINC framework extends PINNs to control applications, allowing for open-ended long-range prediction and control of dynamic systems. In this work, we enhance PINC for modeling highly nonlinear systems such as gas-lifted oil wells. By introducing skip connections in the PINC network and refining certain terms in the ODE, we achieve more accurate gradients during training, resulting in an effective modeling process for the oil well system. Our improved PINC demonstrates superior performance, reducing the validation prediction error by an average of 67% in the oil well application and significantly enhancing gradient flow through the network layers, with gradient magnitudes four orders of magnitude larger than in the original PINC. Furthermore, experiments showcase the efficacy of Model Predictive Control (MPC) in regulating the bottom-hole pressure of the oil well using the improved PINC model, even in the presence of noisy measurements.
LG, Sep 18, 2025
Exploring multimodal implicit behavior learning for vehicle navigation in simulated cities
Eric Aislan Antonelo, Gustavo Claudio Karl Couto, Christian Möller
Standard Behavior Cloning (BC) fails to learn multimodal driving decisions, where multiple valid actions exist for the same scenario. We explore Implicit Behavioral Cloning (IBC) with Energy-Based Models (EBMs) to better capture this multimodality. We propose Data-Augmented IBC (DA-IBC), which improves learning by perturbing expert actions to form the counterexamples of IBC training and using better initialization for derivative-free inference. Experiments in the CARLA simulator with Bird's-Eye View inputs demonstrate that DA-IBC outperforms standard IBC in urban driving tasks designed to evaluate multimodal behavior learning in a test environment. The learned energy landscapes are able to represent multimodal action distributions, which BC fails to achieve.
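To make the idea of perturbation-based counterexamples concrete, here is a minimal sketch of an InfoNCE-style loss of the kind commonly used for implicit behavioral cloning, where the negatives are noisy copies of the expert action. The Gaussian noise model, the noise scale, the toy energy function, and all function names are assumptions for illustration; the abstract does not specify how DA-IBC perturbs the expert actions.

```python
import numpy as np

def make_counterexamples(expert_action, n_neg, sigma, low, high, rng):
    """Negatives formed by adding Gaussian noise to the expert action (assumed scheme)."""
    noise = rng.normal(0.0, sigma, (n_neg, expert_action.shape[0]))
    return np.clip(expert_action + noise, low, high)

def info_nce_loss(energy_fn, obs, expert_action, negatives):
    """Negative log-softmax over negated energies, with the expert action at index 0."""
    actions = np.vstack([expert_action[None, :], negatives])
    logits = -np.array([energy_fn(obs, a) for a in actions])
    logits -= logits.max()  # shift for numerical stability
    log_prob_expert = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_expert

def toy_energy(obs, action):
    """Hypothetical bimodal energy: low near a steering value of -0.5 or +0.5."""
    return min((action[0] + 0.5) ** 2, (action[0] - 0.5) ** 2)

rng = np.random.default_rng(0)
obs = np.zeros(4)             # placeholder observation, unused by the toy energy
expert = np.array([0.5])      # expert steering command in [-1, 1]
negs = make_counterexamples(expert, n_neg=16, sigma=0.3, low=-1.0, high=1.0, rng=rng)
loss = info_nce_loss(toy_energy, obs, expert, negs)
print(f"InfoNCE loss: {loss:.3f}")
```

In an actual model, `toy_energy` would be a trainable network and the loss would be minimized over its parameters, pushing energy down at expert actions and up at the counterexamples. The two minima in the toy energy mimic a scenario with two equally valid actions, which a unimodal regression policy cannot represent.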
CV, Aug 17, 2025
An Initial Study of Bird's-Eye View Generation for Autonomous Vehicles using Cross-View Transformers
Felipe Carlos dos Santos, Eric Aislan Antonelo, Gustavo Claudio Karl Couto
Bird's-Eye View (BEV) maps provide a structured, top-down abstraction that is crucial for autonomous-driving perception. In this work, we employ Cross-View Transformers (CVT) to learn a mapping from camera images to three BEV channels (road, lane markings, and planned trajectory) using a realistic simulator for urban driving. Our study examines generalization to unseen towns, the effect of different camera layouts, and two loss formulations (focal and L1). Using training data from only one town, a four-camera CVT trained with the L1 loss delivers the most robust test performance when evaluated in a new town. Overall, our results underscore CVT's promise for mapping camera inputs to reasonably accurate BEV maps.
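For readers unfamiliar with the two loss formulations compared in this abstract, the sketch below computes an L1 loss and a binary focal loss on a synthetic three-channel BEV map. The map size, the synthetic data, and the focal parameters (`gamma=2.0`, `alpha=0.25`, the common defaults from the original focal loss formulation) are assumptions; the abstract does not state which settings the study used.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between predicted and target BEV maps."""
    return np.mean(np.abs(pred - target))

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over pixels."""
    p = np.clip(pred, eps, 1.0 - eps)
    p_t = np.where(target == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))

rng = np.random.default_rng(0)
# Synthetic sparse target: 3 channels (road, lane markings, trajectory), 64 x 64 pixels.
target = (rng.random((3, 64, 64)) < 0.05).astype(float)
# Synthetic prediction: fairly confident and mostly correct.
pred = np.clip(0.7 * target + 0.3 * rng.random(target.shape), 0.0, 1.0)

print(f"L1 loss:    {l1_loss(pred, target):.4f}")
print(f"focal loss: {focal_loss(pred, target):.4f}")
```

The factor `(1 - p_t) ** gamma` shrinks the contribution of pixels that are already classified well, which is why focal loss is often considered for sparse targets such as lane markings; the L1 loss weights every pixel's error equally.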
RO, Oct 16, 2021
Generative Adversarial Imitation Learning for End-to-End Autonomous Driving on Urban Environments
Gustavo Claudio Karl Couto, Eric Aislan Antonelo
Autonomous driving is a complex task that has been tackled since the first self-driving car, ALVINN, in 1989 with a supervised learning approach known as behavioral cloning (BC). In BC, a neural network is trained with state-action pairs that constitute the training set produced by an expert, i.e., a human driver. However, this type of imitation learning does not take into account the temporal dependencies that might exist between actions taken at different moments of a navigation trajectory. Such tasks are better handled by reinforcement learning (RL) algorithms, which require a reward function to be defined. On the other hand, more recent approaches to imitation learning, such as Generative Adversarial Imitation Learning (GAIL), can train policies without requiring an explicitly defined reward function, allowing an agent to learn by trial and error directly from a training set of expert trajectories. In this work, we propose two variations of GAIL for autonomous navigation of a vehicle in the realistic CARLA simulation environment for urban scenarios. Both use the same network architecture, which processes high-dimensional image input from three frontal cameras, along with nine other continuous inputs representing the velocity, the next point from the sparse trajectory, and a high-level driving command. We show that both are capable of imitating the expert trajectory from start to end once training is complete, but the variant whose GAIL loss function is augmented with BC outperforms the other in terms of convergence time and training stability.
LG, Apr 6, 2021
Physics-Informed Neural Nets for Control of Dynamical Systems
Eric Aislan Antonelo, Eduardo Camponogara, Laio Oriel Seman et al.
Physics-informed neural networks (PINNs) embed known physical laws into the learning of deep neural networks, making sure they respect the physics of the process while decreasing the demand for labeled data. For systems represented by Ordinary Differential Equations (ODEs), the conventional PINN has a continuous time input variable and outputs the solution of the corresponding ODE. In their original form, PINNs do not allow control inputs, nor can they simulate over variable long-range intervals without serious degradation in their predictions. In this context, this work presents a new framework called Physics-Informed Neural Nets for Control (PINC), a novel PINN-based architecture that is amenable to control problems and able to simulate over longer-range time horizons that are not fixed beforehand, making it a very flexible framework when compared to traditional PINNs. Furthermore, this long-range time simulation of differential equations is faster than numerical methods since it relies only on signal propagation through the network, making it less computationally costly and, thus, a better alternative for simulation of models in Model Predictive Control. We showcase our proposal in the control of two nonlinear dynamic systems: the Van der Pol oscillator and the four-tank system.
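The physics-informed loss at the heart of this family of methods penalizes the ODE residual dy/dt - f(y, u) along a predicted trajectory. The sketch below illustrates that residual for a Van der Pol oscillator with an additive control input. Several things here are assumptions rather than details from the abstract: the specific controlled form of the equations, the value of `mu`, the function names, and the use of central finite differences where a PINN would differentiate the network output by automatic differentiation. An RK4 integrator stands in for the numerical baseline.

```python
import numpy as np

def van_der_pol(y, u, mu=1.0):
    """Assumed controlled form: x1' = x2, x2' = mu * (1 - x1^2) * x2 - x1 + u."""
    x1, x2 = y
    return np.array([x2, mu * (1.0 - x1 ** 2) * x2 - x1 + u])

def rk4_step(y, u, dt, mu=1.0):
    """One classical Runge-Kutta step with the control held constant."""
    k1 = van_der_pol(y, u, mu)
    k2 = van_der_pol(y + 0.5 * dt * k1, u, mu)
    k3 = van_der_pol(y + 0.5 * dt * k2, u, mu)
    k4 = van_der_pol(y + dt * k3, u, mu)
    return y + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def physics_residual_mse(traj, u, dt, mu=1.0):
    """Mean squared ODE residual dy/dt - f(y, u) at the interior trajectory points."""
    dydt = (traj[2:] - traj[:-2]) / (2.0 * dt)  # central differences, not autodiff
    f = np.array([van_der_pol(y, u, mu) for y in traj[1:-1]])
    return np.mean((dydt - f) ** 2)

dt, u = 0.01, 0.5
traj = [np.array([1.0, 0.0])]
for _ in range(200):
    traj.append(rk4_step(traj[-1], u, dt))
traj = np.array(traj)  # shape (201, 2), consistent with the ODE

# A trajectory that violates the dynamics, standing in for an untrained network.
bad = traj + 0.05 * np.sin(np.arange(len(traj)))[:, None]

print(f"residual, consistent trajectory: {physics_residual_mse(traj, u, dt):.2e}")
print(f"residual, perturbed trajectory:  {physics_residual_mse(bad, u, dt):.2e}")
```

A trajectory that satisfies the dynamics yields a residual near zero, while the perturbed one is heavily penalized; minimizing this quantity over network parameters is what lets a physics-informed model learn with few labeled data points.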