A Direct Approximation of AIXI Using Logical State Abstractions
This work, aimed at reinforcement learning researchers, addresses the problem of scaling AIXI to practical, complex environments, though it appears incremental because it builds on existing state abstraction and Bayesian methods.
The authors approximate AIXI, a Bayesian optimality notion for reinforcement learning, in complex environments by integrating logical state abstractions. This enables agents to handle non-Markovian and structured settings, with performance validated on epidemic control in large-scale contact networks.
We propose a practical integration of logical state abstraction with AIXI, a Bayesian optimality notion for reinforcement learning agents, to significantly expand the model class that AIXI agents can be approximated over to complex history-dependent and structured environments. The state representation and reasoning framework is based on higher-order logic, which can be used to define and enumerate complex features on non-Markovian and structured environments. We address the problem of selecting the right subset of features to form state abstractions by adapting the $\Phi$-MDP optimisation criterion from state abstraction theory. Exact Bayesian model learning is then achieved using a suitable generalisation of Context Tree Weighting over abstract state sequences. The resultant architecture can be integrated with different planning algorithms. Experimental results on controlling epidemics on large-scale contact networks validate the agent's performance.
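
As background for the model-learning step, the following is a minimal sketch of standard binary Context Tree Weighting with the Krichevsky-Trofimov (KT) estimator. It is not the generalisation over abstract state sequences that the abstract refers to. The names `CTW`, `CTWNode`, `update`, and `log_prob` are hypothetical, and the caller is assumed to supply the context (most recent symbol first) explicitly at each step.

```python
import math


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


class CTWNode:
    def __init__(self):
        self.counts = [0, 0]   # number of 0s and 1s seen in this context
        self.log_kt = 0.0      # log of the KT estimate for those counts
        self.log_w = 0.0       # log of the weighted (mixture) probability
        self.children = {}     # child nodes keyed by the next context symbol


class CTW:
    """Binary CTW of fixed depth (illustrative sketch, not the paper's variant)."""

    def __init__(self, depth):
        self.depth = depth
        self.root = CTWNode()

    def update(self, context, bit):
        # context: at least `depth` past symbols, most recent first (assumption).
        path = [self.root]
        node = self.root
        for s in context[:self.depth]:
            node = node.children.setdefault(s, CTWNode())
            path.append(node)
        # Update from the leaf back up to the root.
        for d in range(len(path) - 1, -1, -1):
            node = path[d]
            a, b = node.counts
            # Sequential KT update: P(bit) = (count[bit] + 1/2) / (a + b + 1).
            node.log_kt += math.log((node.counts[bit] + 0.5) / (a + b + 1.0))
            node.counts[bit] += 1
            if d == self.depth:
                # Leaf: the weighted probability is just the KT estimate.
                node.log_w = node.log_kt
            else:
                # Internal node: equal mix of the KT estimate and the product
                # of the children's weighted probabilities (unseen children
                # contribute probability 1).
                child_sum = sum(c.log_w for c in node.children.values())
                node.log_w = math.log(0.5) + log_add(node.log_kt, child_sum)

    def log_prob(self):
        """Log of the mixture probability of all bits seen so far."""
        return self.root.log_w
```

For example, with `CTW(depth=1)`, calling `update([0], 1)` and then `update([1], 1)` gives a mixture probability `math.exp(model.log_prob())` of 0.5 after the first call and 0.3125 after the second. With `depth=0` the model reduces to a plain KT estimator.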