Composing Task Knowledge with Modular Successor Feature Approximators
This work addresses a bottleneck in reinforcement learning for agents that must transfer knowledge across tasks, though it appears incremental because it builds directly on the SF&GPI framework.
The paper tackles the reliance on hand-designed state features in the Successor Features and Generalized Policy Improvement (SF&GPI) framework. It introduces Modular Successor Feature Approximators (MSFA), a neural network architecture whose modules discover which features are useful to predict and learn their own predictive representations. The authors report better generalization than baseline architectures for learning SFs and than other modular architectures.
Recently, the Successor Features and Generalized Policy Improvement (SF&GPI) framework has been proposed as a method for learning, composing, and transferring predictive knowledge and behavior. SF&GPI works by having an agent learn predictive representations (SFs) that can be combined for transfer to new tasks with GPI. However, to be effective, this approach requires state features that are useful to predict, and these state features are typically hand-designed. In this work, we present a novel neural network architecture, "Modular Successor Feature Approximators" (MSFA), where modules both discover what is useful to predict and learn their own predictive representations. We show that MSFA is able to better generalize compared to baseline architectures for learning SFs and modular architectures.
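To make the "combined for transfer to new tasks with GPI" step concrete, here is a minimal sketch of the standard GPI action-selection rule from the SF&GPI literature. It assumes rewards are linear in the state features, so that each known policy's action value is the dot product of its successor features with a task vector w. The function names, the table PSI, and all numbers are hypothetical and chosen only for illustration; this is not MSFA's architecture or the authors' code.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gpi_action(sfs: List[List[Vector]], w: Vector) -> Tuple[int, float]:
    """Pick an action at one state with generalized policy improvement.

    sfs[i][a] is the successor feature vector of known policy i for
    action a at the current state (hypothetical values, not learned here).
    w is the task vector, assuming reward = features . w.
    Returns (action, value) where value = max_a max_i sfs[i][a] . w.
    """
    num_actions = len(sfs[0])
    best_action, best_value = 0, float("-inf")
    for a in range(num_actions):
        # Evaluate every known policy on the new task, keep the best one.
        q = max(dot(policy_sfs[a], w) for policy_sfs in sfs)
        if q > best_value:
            best_action, best_value = a, q
    return best_action, best_value


# Two known policies, two actions, two features (illustrative numbers).
PSI = [
    [[1.0, 0.0], [0.2, 0.1]],  # policy 0: mostly accumulates feature 0
    [[0.0, 0.3], [0.1, 1.0]],  # policy 1: mostly accumulates feature 1
]

if __name__ == "__main__":
    for w in ([1.0, 0.0], [0.0, 1.0]):
        print(w, gpi_action(PSI, w))
```

With w = [1.0, 0.0] the rule selects action 0, following the policy that accumulates feature 0; with w = [0.0, 1.0] it selects action 1. No retraining is needed for a new w, which is the transfer property the abstract refers to. The sketch takes the features as given, which is exactly the hand-design requirement MSFA is meant to remove.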