Reinforcement Learning in System Identification
This work addresses a recurring challenge for reinforcement learning researchers and engineers, but it appears incremental because it adapts an existing technique to a known bottleneck.
The paper tackles system identification in reinforcement learning, which suffers from difficulties such as error accumulation and non-stationarity. It proposes using reinforcement learning itself to learn the model and reports experimental results suggesting that the approach is promising.
System identification, also known as learning forward models, transfer functions, or system dynamics, has a long tradition in science and engineering across many fields. In particular, it is a recurring theme in Reinforcement Learning research, where forward models approximate the state transition function of a Markov Decision Process by learning a mapping from the current state and action to the next state. This problem is usually posed directly as a Supervised Learning problem. That approach faces several difficulties that stem from the inherent complexity of the dynamics to be learned, for example delayed effects, strong non-linearity, non-stationarity, partial observability and, more importantly, error accumulation over long time horizons when using bootstrapped predictions (predictions based on past predictions). Here we explore the use of Reinforcement Learning for this problem. We elaborate on why and how it fits naturally and soundly as a Reinforcement Learning problem, and present experimental results that demonstrate RL is a promising technique for solving this kind of problem.
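To make the error-accumulation issue concrete, the following minimal sketch compares one-step predictions (the model always sees the true current state) with bootstrapped predictions (the model is fed its own previous output). It is only an illustration under stated assumptions, not the method of the paper: the integrator dynamics, the slightly wrong model gain, the constant action, and the names `true_step`, `model_step`, and `compare_errors` are all hypothetical.

```python
def true_step(state, action):
    """Ground-truth dynamics (hypothetical toy system): a simple integrator."""
    return state + 0.10 * action


def model_step(state, action):
    """Imperfect forward model (hypothetical): its gain is off by 0.01."""
    return state + 0.11 * action


def compare_errors(horizon, initial_state=0.0, action=1.0):
    """Return per-step absolute errors for one-step and bootstrapped prediction."""
    one_step_errors = []
    bootstrapped_errors = []
    true_state = initial_state
    model_state = initial_state  # state of the model's own open-loop rollout

    for _ in range(horizon):
        next_true = true_step(true_state, action)

        # One-step prediction: the model starts from the TRUE current state.
        one_step_errors.append(abs(model_step(true_state, action) - next_true))

        # Bootstrapped prediction: the model starts from its OWN last prediction,
        # so earlier errors are carried forward into later steps.
        model_state = model_step(model_state, action)
        bootstrapped_errors.append(abs(model_state - next_true))

        true_state = next_true

    return one_step_errors, bootstrapped_errors


if __name__ == "__main__":
    one_step, bootstrapped = compare_errors(horizon=50)
    print("one-step error at last step:    ", round(one_step[-1], 4))
    print("bootstrapped error at last step:", round(bootstrapped[-1], 4))
```

In this toy setting the one-step error stays at about 0.01 at every step, while the bootstrapped error grows roughly linearly with the horizon and reaches about 0.5 after 50 steps. A model that looks accurate under a one-step supervised loss can therefore still drift badly in long rollouts.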