Hierarchical Support Vector State Partitioning for Distilling Black Box Reinforcement Learning Policies
Provides a more flexible and interpretable distillation method for black-box RL policies, improving performance and reducing complexity for practitioners needing transparent agents.
SVSP distills black-box RL policies into interpretable subpolicies using SVM-based state partitioning, improving mean return by +7.4% over VSP and +2.8% over the original TD3 policy while reducing subpolicy count by 82.1%.
We introduce Support Vector State Partitioning (SVSP), a novel method for mimicking a black-box reinforcement learning policy with a set of human-interpretable subpolicies. By partitioning a distillation dataset of state-action pairs with linear support vector machine splits, SVSP constructs a compact and structured representation of the original policy. Our method improves mean return by +7.4\% over previous critic-driven state partitioning approaches such as Voronoi State Partitioning (VSP) and by +2.8\% over the original TD3 policy, while requiring 82.1\% fewer subpolicies than VSP. Our results pave the way towards a more flexible form of distillation in which both the decision boundary and the surrogate models can be chosen within a margin of the original black-box behavior.
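To make the idea of partitioning state-action pairs with linear SVM splits concrete, the following is a minimal toy sketch and not the authors' implementation. Several things are assumptions made purely for illustration: the abstract does not say how the two classes for each split are labelled (here, by which half of the action range a sample falls in), the subpolicies are taken to be affine least-squares fits, the recursion stops at a hypothetical tolerance `tol`, and the "black box" is a hand-written two-regime function rather than a TD3 policy. The linear SVM is a hand-rolled subgradient-descent solver so that the example needs only NumPy; all function names are hypothetical.

```python
import numpy as np

def fit_linear_svm(X, y, lam=1e-3, lr=0.1, epochs=300):
    """Linear SVM via full-batch subgradient descent on the hinge loss; y in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1.0            # samples violating the margin
        w -= lr * (lam * w - X[active].T @ y[active] / n)
        b += lr * y[active].sum() / n
    return w, b

def fit_affine(S, a):
    """Least-squares affine subpolicy: a ~ S @ theta[:-1] + theta[-1]."""
    S1 = np.column_stack([S, np.ones(len(S))])
    theta, *_ = np.linalg.lstsq(S1, a, rcond=None)
    return theta

def affine_predict(theta, S):
    return S @ theta[:-1] + theta[-1]

def build(S, a, tol=1e-3, depth=0, max_depth=3, min_size=10):
    """Recursively split the dataset until an affine subpolicy fits within tol."""
    theta = fit_affine(S, a)
    mse = np.mean((affine_predict(theta, S) - a) ** 2)
    if mse <= tol or depth >= max_depth:
        return {"theta": theta}
    # ASSUMPTION: class labels come from the half of the action range a sample is in.
    y = np.where(a > 0.5 * (a.min() + a.max()), 1.0, -1.0)
    w, b = fit_linear_svm(S, y)
    pos = S @ w + b > 0
    if min(pos.sum(), (~pos).sum()) < min_size:   # degenerate split: keep a leaf
        return {"theta": theta}
    return {"w": w, "b": b,
            "neg": build(S[~pos], a[~pos], tol, depth + 1, max_depth, min_size),
            "pos": build(S[pos], a[pos], tol, depth + 1, max_depth, min_size)}

def predict(node, s):
    """Route one state through the SVM splits, then apply the leaf subpolicy."""
    while "theta" not in node:
        node = node["pos"] if s @ node["w"] + node["b"] > 0 else node["neg"]
    return affine_predict(node["theta"], s)

def count_leaves(node):
    if "theta" in node:
        return 1
    return count_leaves(node["neg"]) + count_leaves(node["pos"])

# Toy distillation dataset: a stand-in two-regime "black box" policy (hypothetical).
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(500, 2))
S = S[np.abs(S.sum(axis=1)) > 0.2]                # small gap keeps the toy regimes separable
a = 0.5 * S[:, 0] + np.sign(S.sum(axis=1))

tree = build(S, a)
tree_mse = np.mean([(predict(tree, s) - a_i) ** 2 for s, a_i in zip(S, a)])
single_mse = np.mean((affine_predict(fit_affine(S, a), S) - a) ** 2)
print("subpolicies:", count_leaves(tree))
print("fidelity MSE, partitioned vs. single affine model:", tree_mse, single_mse)
```

Each leaf holds one affine subpolicy and each internal node holds one hyperplane, so both the decision boundaries and the surrogate models remain directly inspectable. In this sketch, `tol` plays the role of an allowed deviation from the black-box behavior: loosening it yields fewer subpolicies at the cost of fidelity.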