Structural Similarity for Improved Transfer in Reinforcement Learning
The work addresses a practical challenge for RL practitioners, namely how to define the relationship between source and target tasks, though the contribution is incremental and its empirical validation is confined to a specific setting.
The paper tackles the problem of defining relationships between source and target tasks in reinforcement learning transfer by introducing SS2, a state similarity measure based on bisimulation metrics, and presents evidence that it improves Q-Learning transfer performance in GridWorld navigation tasks.
Transfer learning is an increasingly common approach for developing performant RL agents. However, it is not well understood how to define the relationship between the source and target tasks, and how this relationship contributes to successful transfer. We present an algorithm called Structural Similarity for Two MDPs, or SS2, that calculates a state similarity measure for states in two finite MDPs based on previously developed bisimulation metrics, and show that the measure satisfies the properties of a distance metric. Then, through empirical results with GridWorld navigation tasks, we provide evidence that the distance measure can be used to improve transfer performance for Q-Learning agents over previous implementations.
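To make the idea of a bisimulation-based cross-MDP state distance concrete, the following is a minimal sketch and not the SS2 algorithm itself, whose details are not given here. It assumes two finite MDPs with a shared action set and deterministic transitions, so the usual Kantorovich term reduces to the distance between successor states. The function name `bisim_distance`, the array layout, and the weights `c_r` and `c_t` are all hypothetical choices made for illustration.

```python
import numpy as np

def bisim_distance(R1, N1, R2, N2, c_r=0.5, c_t=0.5, tol=1e-8, max_iter=1000):
    """Fixed-point iteration for a simplified bisimulation-style distance.

    Assumptions (not from the paper): deterministic transitions and a shared
    action set. R1, R2 are (n_states, n_actions) reward arrays; N1, N2 are
    integer arrays of the same shape holding next-state indices.
    Returns an (n1, n2) array d where d[s, t] compares state s of MDP 1
    with state t of MDP 2.
    """
    n1, _ = R1.shape
    n2, _ = R2.shape
    d = np.zeros((n1, n2))
    # Per-action reward differences, shape (n1, n2, n_actions).
    r_diff = np.abs(R1[:, None, :] - R2[None, :, :])
    for _ in range(max_iter):
        # Current distance between the successors of s and t under each action.
        succ = d[N1[:, None, :], N2[None, :, :]]
        # Worst case over actions; with c_t < 1 this update is a contraction.
        new_d = np.max(c_r * r_diff + c_t * succ, axis=2)
        if np.max(np.abs(new_d - d)) < tol:
            return new_d
        d = new_d
    return d

# Toy example: a two-state, two-action MDP compared with itself.
R = np.array([[0.0, 1.0], [0.0, 0.0]])  # only state 0 is rewarded, for action 1
N = np.array([[0, 1], [0, 1]])          # action a always leads to state a
d = bisim_distance(R, N, R, N)
print(d)
```

One plausible way to use such a matrix for transfer, again as an assumption rather than the paper's procedure, is to initialize each target state's Q-values from the source state with the smallest distance, for example via `d.argmin(axis=0)`.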