Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning
This work addresses control optimization for stochastic systems. The contribution is incremental: it applies reinforcement learning to a known problem under specific noise types.
The paper tackles average cost minimization for discrete-time stochastic systems with multiplicative and additive noises using reinforcement learning. It proposes an online model-free algorithm that estimates the Q-function kernel matrix and updates the control gain, proves that both converge to their optimal values, and illustrates the approach with a numerical example.
This paper addresses the average cost minimization problem for discrete-time systems with multiplicative and additive noises via reinforcement learning. Using the Q-function, we propose an online learning scheme that estimates the kernel matrix of the Q-function and updates the control gain using data collected along the system trajectories. The obtained control gain and kernel matrix are proved to converge to the optimal ones. To implement the proposed learning scheme, an online model-free reinforcement learning algorithm is given, in which the recursive least squares method is used to estimate the kernel matrix of the Q-function. A numerical example is presented to illustrate the proposed approach.
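To make the estimation step concrete, the following is a minimal sketch of how recursive least squares can recover the symmetric kernel matrix of a quadratic form from streamed samples. It is not the paper's algorithm. It assumes noise-free targets y = z^T H z, randomly drawn vectors z in place of state-input data from a system trajectory, and no forgetting factor. The names quad_features, rls_update, and estimate_kernel are hypothetical and chosen only for illustration.

```python
import random


def quad_features(z):
    """Return phi(z) such that z^T H z = phi(z)^T theta for symmetric H.

    theta stacks the upper-triangular entries of H row by row;
    off-diagonal products are doubled because H[i][j] == H[j][i].
    """
    n = len(z)
    return [(1.0 if i == j else 2.0) * z[i] * z[j]
            for i in range(n) for j in range(i, n)]


def rls_update(theta, P, phi, y):
    """One standard recursive least squares step (no forgetting factor)."""
    m = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(m)) for i in range(m)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(m))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(m))
    new_theta = [theta[i] + gain[i] * err for i in range(m)]
    # P is symmetric, so phi^T P equals (P phi)^T.
    new_P = [[P[i][j] - gain[i] * P_phi[j] for j in range(m)]
             for i in range(m)]
    return new_theta, new_P


def estimate_kernel(H_true, num_samples=200, seed=0):
    """Estimate H from samples of z^T H z (assumed noise-free here)."""
    rng = random.Random(seed)
    n = len(H_true)
    m = n * (n + 1) // 2
    theta = [0.0] * m
    # A large initial covariance encodes a weak prior on theta.
    P = [[1e6 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(num_samples):
        z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = sum(z[i] * H_true[i][j] * z[j]
                for i in range(n) for j in range(n))
        theta, P = rls_update(theta, P, quad_features(z), y)
    # Unpack theta back into a symmetric matrix.
    H = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            H[i][j] = H[j][i] = theta[k]
            k += 1
    return H
```

Calling estimate_kernel([[2.0, 0.5], [0.5, 1.0]]) should return a matrix close to the input. In an online control setting, z would instead stack the measured state and input, and the regression target would be formed from observed costs rather than from a known H.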