Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial
This is an incremental tutorial aimed at researchers or practitioners seeking an overview of existing RL methods for dynamical systems.
The paper provides a tutorial review of model-free reinforcement learning algorithms, including Temporal Difference, Q-Learning, and Approximate Q-learning, for learning dynamical systems in uncertain environments, without presenting new experimental results or numerical outcomes.
This paper reviews model-free reinforcement learning for learning dynamical systems in uncertain environments. To this end, the Markov Decision Process (MDP) is reviewed first. The learning algorithms that form the main part of the article, namely Temporal Difference (TD) learning, Q-Learning, and Approximate Q-learning, are then examined as model-free methods, and the benefits and drawbacks of each are discussed. The concepts in each section are explained in detail with examples.