LG · ML · Sep 26, 2013

Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs

arXiv:1309.6868v1 · 7 citations

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific challenge in reinforcement learning for continuous state spaces, offering an incremental improvement over existing methods.

The paper addresses the problem of learning effective policies for Markov Decision Processes with continuous states via Q-Learning. It develops an approximate Kalman filter model to estimate linear weights that minimize the mean Bellman residual, and it reports outperforming current state-of-the-art projected TD-Learning methods on several standard benchmarks.

We seek to learn an effective policy for a Markov Decision Process (MDP) with continuous states via Q-Learning. Given a set of basis functions over state-action pairs, we search for a corresponding set of linear weights that minimizes the mean Bellman residual. Our algorithm uses a Kalman filter model to estimate those weights, and we have developed a simpler approximate Kalman filter model that outperforms the current state-of-the-art projected TD-Learning methods on several standard benchmark problems.
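To make the idea of estimating linear Q-weights with a Kalman filter concrete, here is a minimal sketch of a textbook scalar-observation Kalman update. It is not the paper's approximate filter, whose details the abstract does not give. The assumptions are: Q(s, a) is modeled as the dot product of weights `w` and basis features `phi`, the bootstrapped target `reward + gamma * Q(s', a*)` is treated as a noisy observation of that dot product, and the weights are assumed static (no process noise). The names `kalman_q_update`, `phi_next_best`, and `obs_var`, and the default values, are hypothetical and chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def kalman_q_update(w, P, phi, phi_next_best, reward, gamma=0.95, obs_var=1.0):
    """One Kalman measurement update of linear Q-weights (illustrative sketch).

    w             -- current weight mean, list of length n
    P             -- current weight covariance, n x n list of lists (symmetric)
    phi           -- basis features of the visited (state, action) pair
    phi_next_best -- basis features of (next state, greedy next action);
                     pass all zeros for a terminal transition
    obs_var       -- assumed variance of the observation noise (hypothetical)
    """
    n = len(w)

    # Bootstrapped Q-Learning target, treated as the "measurement".
    target = reward + gamma * dot(w, phi_next_best)

    # Innovation: measurement minus the current prediction phi . w.
    innovation = target - dot(w, phi)

    # P @ phi, reused for the gain and the covariance update.
    P_phi = [dot(P[i], phi) for i in range(n)]

    # Innovation variance and Kalman gain.
    s = dot(phi, P_phi) + obs_var
    gain = [v / s for v in P_phi]

    # Posterior mean and covariance (P is symmetric, so phi^T P = (P phi)^T).
    w_new = [w[i] + gain[i] * innovation for i in range(n)]
    P_new = [[P[i][j] - gain[i] * P_phi[j] for j in range(n)] for i in range(n)]
    return w_new, P_new
```

In use, one would start from a prior such as zero weights and an identity covariance, then call `kalman_q_update` once per observed transition. The covariance shrinks along feature directions that have been observed, which reduces the effective step size for those weights over time.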
