Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction
This work addresses the need for understandable policies in reinforcement learning, though the contribution is incremental, since it builds on existing neuro-symbolic RL methods.
The paper tackles the problem of creating reinforcement learning policies that are both interpretable and explainable. It introduces NUDGE, which uses trained neural agents to guide the search for weighted logic rules and then trains those rules with differentiable logic. The resulting policies are reported to outperform purely neural ones and to adapt well to environments with different initial states and problem sizes.
The limited priors required by neural networks make them the dominant choice to encode and learn policies using reinforcement learning (RL). However, they are also black boxes, making it hard to understand the agent's behaviour, especially when working on the image level. Therefore, neuro-symbolic RL aims at creating policies that are interpretable in the first place. Unfortunately, interpretability is not explainability. To achieve both, we introduce Neurally gUided Differentiable loGic policiEs (NUDGE). NUDGE exploits trained neural network-based agents to guide the search for candidate weighted logic rules, then uses differentiable logic to train the logic agents. Our experimental evaluation demonstrates that NUDGE agents can induce interpretable and explainable policies while outperforming purely neural ones and showing good flexibility to environments of different initial states and problem sizes.
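
The two stages described above (neurally guided rule search, then a weighted-rule policy) can be illustrated with a toy sketch. This is not the authors' implementation. The rule set, the atom names (`enemy_close`, `key_right`, `has_key`, `door_left`), the scoring heuristic in `guided_scores`, and the use of a product as fuzzy conjunction are all assumptions made for illustration. In a real system the rule weights would be trainable parameters under an autodiff framework; here they are plain floats.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate rules: (action, body). Each body maps a symbolic
# state (dict of atom name -> truth degree in [0, 1]) to a truth degree.
# Product is used as fuzzy AND, (1 - x) as fuzzy NOT (an assumption).
RULES = [
    ("jump", lambda s: s["enemy_close"]),
    ("right", lambda s: s["key_right"] * (1.0 - s["has_key"])),
    ("left", lambda s: s["door_left"] * s["has_key"]),
]

def guided_scores(rules, states, neural_policy):
    """Assumed heuristic: a rule scores highly if it fires in states
    where the trained neural agent assigns high probability to its action."""
    return [
        sum(body(s) * neural_policy(s)[action] for s in states)
        for action, body in rules
    ]

def action_distribution(rules, weights, state, eps=1e-8):
    """Weighted-rule policy: softmax the rule weights, accumulate
    weight * truth degree per action, then normalise to a distribution."""
    w = softmax(weights)
    raw = {}
    for (action, body), wi in zip(rules, w):
        raw[action] = raw.get(action, 0.0) + wi * body(state)
    total = sum(raw.values()) + eps * len(raw)
    return {a: (v + eps) / total for a, v in raw.items()}

# Usage: a state where the key is to the right and not yet collected.
state = {"enemy_close": 0.0, "key_right": 1.0, "has_key": 0.0, "door_left": 1.0}
dist = action_distribution(RULES, [0.0, 0.0, 0.0], state)
best_action = max(dist, key=dist.get)

# Stand-in for a trained neural agent (fixed probabilities, hypothetical).
neural = lambda s: {"jump": 0.1, "right": 0.8, "left": 0.1}
scores = guided_scores(RULES, [state], neural)
```

Because every action probability is a sum of named rule contributions, the rule with the largest `weight * truth degree` term can be read off as an explanation for the chosen action, which is the sense of interpretability the abstract refers to.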