Discovering Behavioral Modes in Deep Reinforcement Learning Policies Using Trajectory Clustering in Latent Space
This work addresses the challenge of interpreting deep reinforcement learning (DRL) agents, which matters to both researchers and practitioners, but its contribution is incremental: it applies existing clustering techniques to a new context.
The paper tackles the problem of understanding complex deep reinforcement learning (DRL) policies. It introduces a method that combines dimensionality reduction with trajectory clustering in the policy network's latent space to identify behavior modes and suboptimal choices. The method is demonstrated on the Mountain Car task, where it enables targeted performance improvements.
Understanding the behavior of deep reinforcement learning (DRL) agents is crucial for improving their performance and reliability. However, the complexity of their policies often makes them difficult to interpret. In this paper, we introduce a new approach for investigating the behavior modes of DRL policies that applies dimensionality reduction and trajectory clustering to the latent space of the policy's neural network. Specifically, we use Pairwise Controlled Manifold Approximation Projection (PaCMAP) for dimensionality reduction and TRACLUS for trajectory clustering to analyze the latent space of a DRL policy trained on the Mountain Car control task. Our methodology helps identify diverse behavior patterns and suboptimal choices made by the policy, thus allowing for targeted improvements. We demonstrate how our approach, combined with domain knowledge, can enhance a policy's performance in specific regions of the state space.
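For readers unfamiliar with TRACLUS, its grouping step compares line segments of partitioned trajectories using a weighted sum of a perpendicular, a parallel, and an angular distance. The sketch below illustrates that segment distance for segments in a 2D embedding such as the one PaCMAP produces. It is an illustration only, not the authors' implementation: the function names, the tuple representation of points, and the equal default weights are all assumptions made here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (start, end) in the 2D embedding


def _dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def _project(p: Point, s: Point, e: Point) -> Point:
    """Project point p onto the infinite line through s and e."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    t = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / (dx * dx + dy * dy)
    return (s[0] + t * dx, s[1] + t * dy)


def segment_distance(seg_a: Segment, seg_b: Segment,
                     w_perp: float = 1.0, w_par: float = 1.0,
                     w_ang: float = 1.0) -> float:
    """TRACLUS-style distance between two line segments (hypothetical helper)."""
    len_a, len_b = _dist(*seg_a), _dist(*seg_b)
    if len_a == 0.0 or len_b == 0.0:
        raise ValueError("zero-length segments are not supported in this sketch")

    # By convention the longer segment is L_i and the shorter one is L_j.
    (si, ei), (sj, ej) = (seg_a, seg_b) if len_a >= len_b else (seg_b, seg_a)
    len_i, len_j = max(len_a, len_b), min(len_a, len_b)

    # Projections of the shorter segment's endpoints onto the longer one.
    ps, pe = _project(sj, si, ei), _project(ej, si, ei)

    # Perpendicular distance: Lehmer mean of order 2 of the two offsets.
    l1, l2 = _dist(sj, ps), _dist(ej, pe)
    d_perp = 0.0 if l1 + l2 == 0.0 else (l1 * l1 + l2 * l2) / (l1 + l2)

    # Parallel distance: how far the projections lie from L_i's endpoints.
    d_par = min(min(_dist(ps, si), _dist(ps, ei)),
                min(_dist(pe, si), _dist(pe, ei)))

    # Angular distance: |L_j| * sin(theta) below 90 degrees, |L_j| otherwise.
    dot = (ei[0] - si[0]) * (ej[0] - sj[0]) + (ei[1] - si[1]) * (ej[1] - sj[1])
    cos_theta = max(-1.0, min(1.0, dot / (len_i * len_j)))
    if cos_theta > 0.0:
        d_ang = len_j * math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    else:
        d_ang = len_j

    return w_perp * d_perp + w_par * d_par + w_ang * d_ang
```

In a full pipeline, a distance of this kind would feed a density-based grouping of segments, so that segments of latent-space trajectories that run close together and in the same direction end up in the same cluster, which can then be read as a candidate behavior mode.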