A New Representation of Successor Features for Transfer across Dissimilar Environments
This work addresses a key limitation of RL transfer in real-world applications, where environment dynamics vary. The contribution appears incremental, however, since it builds on existing successor features methods and adds a new modelling technique.
The paper tackles the problem of transferring reinforcement learning knowledge across environments with different dynamics. It proposes a successor features approach based on Gaussian Processes, proves that the approach converges, and reports that it outperforms existing baselines on benchmarks.
Transfer in reinforcement learning is usually achieved through generalisation across tasks. Whilst many studies have investigated transferring knowledge when the reward function changes, they have assumed that the dynamics of the environments remain consistent. Many real-world RL problems require transfer among environments with different dynamics. To address this problem, we propose an approach based on successor features in which we model successor feature functions with Gaussian Processes permitting the source successor features to be treated as noisy measurements of the target successor feature function. Our theoretical analysis proves the convergence of this approach as well as the bounded error on modelling successor feature functions with Gaussian Processes in environments with both different dynamics and rewards. We demonstrate our method on benchmark datasets and show that it outperforms current baselines.
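The core modelling idea, treating source successor features as noisy measurements of the target successor feature function, can be illustrated with a small Gaussian Process regression sketch. This is not the paper's algorithm. The squared-exponential kernel, the one-dimensional inputs, the single scalar feature component, the synthetic sine functions, the 0.5 offset standing in for a change in dynamics, and the noise variances are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between inputs a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, noise_var, x_query):
    """Zero-mean GP posterior mean and variance with per-point noise variances."""
    k_tt = rbf_kernel(x_train, x_train) + np.diag(noise_var)
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = 1.0 - np.sum(v**2, axis=0)  # the prior variance of this kernel is 1
    return mean, var


# Hypothetical data: 1-D inputs and one scalar successor feature component.
x_src = np.linspace(0.0, 6.0, 30)[:, None]
psi_src = np.sin(x_src[:, 0]) + 0.5  # source values; the offset mimics different dynamics
x_tgt = np.array([[1.0], [3.0], [5.0]])
psi_tgt = np.sin(x_tgt[:, 0])        # a few values observed in the target environment

# Source values enter the same GP as target values, but with a much larger
# noise variance, so they act as noisy measurements of the target function.
x_all = np.vstack([x_src, x_tgt])
psi_all = np.concatenate([psi_src, psi_tgt])
noise_all = np.concatenate([np.full(30, 0.5), np.full(3, 1e-4)])

x_query = np.array([[0.0], [2.0], [4.0], [6.0]])  # inputs with no target data
mean_tgt_only, var_tgt_only = gp_posterior(x_tgt, psi_tgt, np.full(3, 1e-4), x_query)
mean_combined, var_combined = gp_posterior(x_all, psi_all, noise_all, x_query)
mean_at_tgt, _ = gp_posterior(x_all, psi_all, noise_all, x_tgt)
```

In this sketch, the low-noise target values dominate the posterior wherever they are available, while the high-noise source values reduce posterior uncertainty at inputs the target data does not cover. How the noise level for source values should be set in practice is not specified here.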