Batch-iFDD for Representation Expansion in Large MDPs
This addresses a bottleneck for researchers and practitioners in reinforcement learning by improving scalability in large MDPs, though it is incremental as it builds on existing matching pursuit methods.
The paper tackles the scalability issue in matching pursuit methods for value function approximation by introducing Batch-iFDD, which eliminates the need for a large feature pool and reduces computational complexity, achieving empirical results across domains with up to one million states.
Matching pursuit (MP) methods are a promising class of feature construction algorithms for value function approximation. Yet existing MP methods require creating a pool of potential features, mandating expert knowledge or enumeration of a large feature pool, both of which hinder scalability. This paper introduces batch incremental feature dependency discovery (Batch-iFDD) as an MP method that inherits a provable convergence property. Additionally, Batch-iFDD does not require a large pool of features, leading to lower computational complexity. Empirical policy evaluation results across three domains with up to one million states highlight the scalability of Batch-iFDD over the previous state of the art MP algorithm.