SY · Feb 28, 2017
Model-based reinforcement learning in differential graphical games
Rushikesh Kamalapurkar, Justin R. Klotz, Patrick Walters et al.
This paper seeks to combine differential game theory with the actor-critic-identifier architecture to determine forward-in-time, approximate optimal controllers for formation tracking in multi-agent systems, where the agents have uncertain heterogeneous nonlinear dynamics. A continuous control strategy is proposed, using communication feedback from extended neighbors on a communication topology that has a spanning tree. A model-based reinforcement learning technique is developed to cooperatively control a group of agents to track a trajectory in a desired formation. Simulation results are presented to demonstrate the performance of the developed technique.
SY · Oct 28, 2017
Online Approximate Optimal Station Keeping of a Marine Craft in the Presence of a Current
Patrick Walters, Rushikesh Kamalapurkar, Forrest Voight et al.
Online approximation of the optimal station keeping strategy for a fully actuated six degrees-of-freedom marine craft subject to an irrotational ocean current is considered. An approximate solution to the optimal control problem is obtained using an adaptive dynamic programming technique. The hydrodynamic drift dynamics of the dynamic model are assumed to be unknown; therefore, a concurrent learning-based system identifier is developed to identify the unknown model parameters. The identified model is used to implement an adaptive model-based reinforcement learning technique to estimate the unknown value function. The developed policy guarantees uniformly ultimately bounded convergence of the vehicle to the desired station and uniformly ultimately bounded convergence of the approximated policies to the optimal policies without the requirement of persistence of excitation. The developed strategy is validated using an autonomous underwater vehicle, where the three degrees-of-freedom in the horizontal plane are regulated. The experiments are conducted in a second-magnitude spring located in central Florida.
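As a rough illustration of the concurrent learning idea mentioned in this abstract, the sketch below estimates the parameters of a linearly parameterized drift from a fixed stack of recorded data rather than from a persistently exciting trajectory. It is not the paper's identifier: the regressor `regressor`, the parameter vector `theta_true`, the recorded points, and the assumption that the drift values at those points are available without noise are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

def regressor(x):
    # Hypothetical basis for the unknown drift: f(x) = regressor(x) @ theta
    return np.array([[x[0], x[1], 0.0, 0.0],
                     [0.0, 0.0, x[0], x[1] * abs(x[1])]])

# Hypothetical "true" parameters, used here only to generate recorded data
theta_true = np.array([-1.0, 1.0, -0.5, -0.5])

# History stack: a few recorded (regressor, drift) pairs. The drift values are
# assumed to be measured exactly, which a real identifier would not have.
recorded_states = [np.array([a, b]) for a in (-1.0, 0.5) for b in (-0.5, 1.0)]
stack = [(regressor(x), regressor(x) @ theta_true) for x in recorded_states]

def concurrent_learning_step(theta_hat, stack, gain, dt):
    # Euler step of theta_hat_dot = gain * sum_j Y_j^T (f_j - Y_j theta_hat)
    update = sum(Y.T @ (f - Y @ theta_hat) for Y, f in stack)
    return theta_hat + dt * gain * update

# Convergence needs sum_j Y_j^T Y_j to be full rank, a condition on the
# recorded data that can be checked, unlike persistence of excitation.
info = sum(Y.T @ Y for Y, _ in stack)
assert np.linalg.matrix_rank(info) == theta_true.size

theta_hat = np.zeros(4)
for _ in range(2000):
    theta_hat = concurrent_learning_step(theta_hat, stack, gain=1.0, dt=0.01)
```

The rank check on the stacked regressors stands in for the excitation requirement: once the recorded data are rich enough, the estimate keeps improving even if the current trajectory carries no new information.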
SY · Jun 1, 2015
Model-based reinforcement learning for infinite-horizon approximate optimal tracking
Rushikesh Kamalapurkar, Lindsey Andrews, Patrick Walters et al.
This paper provides an approximate online adaptive solution to the infinite-horizon optimal tracking problem for control-affine continuous-time nonlinear systems with unknown drift dynamics. Model-based reinforcement learning is used to relax the persistence of excitation condition. Model-based reinforcement learning is implemented using a concurrent learning-based system identifier to simulate experience by evaluating the Bellman error over unexplored areas of the state space. Tracking of the desired trajectory and convergence of the developed policy to a neighborhood of the optimal policy are established via Lyapunov-based stability analysis. Simulation results demonstrate the effectiveness of the developed technique.
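To make the idea of "simulating experience" concrete, the sketch below evaluates the Bellman error at a grid of states that the system never visits, using an estimated model, and adjusts a critic weight to reduce it. It is a toy under strong assumptions, not the paper's algorithm: a scalar regulation problem instead of tracking, a hypothetical identified model `x_dot = a_hat * x + u`, cost `x**2 + u**2`, a single shared weight `w` with `V(x) = w * x**2`, and plain gradient descent in place of the paper's actor and critic update laws.

```python
import numpy as np

a_hat = -1.0  # hypothetical drift parameter supplied by a system identifier

def bellman_error(w, x):
    grad_v = 2.0 * w * x      # dV/dx for the approximation V(x) = w * x^2
    u = -0.5 * grad_v         # policy minimizing the Hamiltonian for cost x^2 + u^2
    return grad_v * (a_hat * x + u) + x**2 + u**2

def critic_step(w, points, lr):
    # Gradient step on 0.5 * mean(delta^2), with delta evaluated at sampled points
    delta = bellman_error(w, points)
    d_delta_dw = points**2 * (2.0 * a_hat - 2.0 * w)
    return w - lr * np.mean(delta * d_delta_dw)

# States chosen off-trajectory; the model lets the Bellman error be evaluated there
points = np.linspace(-2.0, 2.0, 9)

w = 0.0
for _ in range(1000):
    w = critic_step(w, points, lr=0.01)

# For this scalar problem the optimal value function is known in closed form:
# V*(x) = (a + sqrt(a^2 + 1)) * x^2, so w can be compared with it directly.
w_star = a_hat + np.sqrt(a_hat**2 + 1.0)
```

Because the Bellman error is computed from the model at arbitrary points, the weight can be driven toward `w_star` without the state trajectory itself exploring the region covered by `points`.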
SY · Sep 30, 2013
Online Approximate Optimal Station Keeping of an Autonomous Underwater Vehicle
Patrick Walters, Warren E. Dixon
Online approximation of an optimal station keeping strategy for a fully actuated six degrees-of-freedom autonomous underwater vehicle is considered. The developed controller is an approximation of the solution to a two-player zero-sum game where the controller is the minimizing player and an external disturbance is the maximizing player. The solution is approximated using a reinforcement learning-based actor-critic framework. The result guarantees uniformly ultimately bounded (UUB) convergence of the states and UUB convergence of the approximated policies to the optimal policies without the requirement of persistence of excitation.