Discovering Continuous-Time Memory-Based Symbolic Policies using Genetic Programming
This work addresses the need for interpretable AI in control problems, particularly in partially observable environments, though it is incremental as it builds on existing symbolic and genetic programming techniques.
The paper tackles the problem of improving interpretability and transparency in control systems by developing symbolic policies with continuous-time memory, using genetic programming for optimization. The results show that these policies perform comparably to black-box methods on various control tasks and outperform memory-less policies in scenarios requiring memory.
Artificial intelligence techniques are increasingly being applied to solve control problems, but often rely on black-box methods without transparent output generation. To improve interpretability and transparency in control systems, models can be defined as white-box symbolic policies described by mathematical expressions. For better performance in partially observable and volatile environments, the symbolic policies are extended with memory represented by continuous-time latent variables, governed by differential equations. Genetic programming is used for optimisation, resulting in interpretable policies consisting of symbolic expressions. Our results show that symbolic policies with memory perform comparably to black-box policies on a variety of control tasks. Furthermore, the benefit of memory in symbolic policies is demonstrated in experiments where memory-less policies fall short. Overall, we present a method for evolving high-performing symbolic policies that offer interpretability and transparency, which black-box models lack.
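To make the idea of a memory-based symbolic policy concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy plant (a double integrator in which only the position is observed), a single latent variable `a`, and hand-written expressions with hypothetical constants `tau`, `k1` and `k2`; in the method described above such expressions would be found by genetic programming, which is not shown here. The latent variable follows a differential equation driven by the observation, and the control signal is read out from the observation and the latent state.

```python
import math


def memory_policy(y, a, tau=0.5, k1=1.0, k2=1.0):
    """Hand-written (not evolved) symbolic policy with one latent variable.

    Latent dynamics: da/dt = (y - a) / tau   (low-pass filter of the observation)
    Readout:         u = -k1*y - k2*(y - a)/tau
    The term (y - a)/tau acts as a filtered estimate of the unobserved velocity.
    """
    da_dt = (y - a) / tau
    u = -k1 * y - k2 * (y - a) / tau
    return da_dt, u


def memoryless_policy(y, a, k1=1.0):
    """Static symbolic policy: the latent variable is ignored and never changes."""
    return 0.0, -k1 * y


def rollout(policy, t_end=20.0, dt=0.01):
    """Simulate the assumed plant x'' = u with a simple fixed-step integrator.

    Only the position x is observed. Returns the final distance of (x, v)
    from the origin, so smaller values mean better stabilisation.
    """
    x, v, a = 1.0, 0.0, 0.0  # initial position, velocity, latent state (assumed)
    for _ in range(int(round(t_end / dt))):
        y = x                      # partial observation: position only
        da_dt, u = policy(y, a)
        a += dt * da_dt            # integrate the latent (memory) variable
        v += dt * u                # semi-implicit Euler step for the plant
        x += dt * v
    return math.hypot(x, v)


amp_memory = rollout(memory_policy)
amp_memoryless = rollout(memoryless_policy)
print(f"final amplitude with memory:    {amp_memory:.2e}")
print(f"final amplitude without memory: {amp_memoryless:.2e}")
```

In this toy setting the memory-less policy can only apply a restoring force, so the state keeps oscillating, whereas the latent variable gives the memory-based policy access to a velocity estimate that it can use for damping. Both policies remain short, readable expressions, which is the property the abstract emphasises.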