Deep Active Inference for Partially Observable MDPs
This work extends deep active inference to partially observable environments, an incremental advance aimed at researchers in reinforcement learning.
The paper addresses the restriction of deep active inference to fully observable domains by developing a model that learns policies directly from high-dimensional sensory inputs, achieving performance comparable to or better than deep Q-learning on the OpenAI benchmark.
Deep active inference has been proposed as a scalable approach to perception and action that deals with large policy and state spaces. However, current models are limited to fully observable domains. In this paper, we describe a deep active inference model that can learn successful policies directly from high-dimensional sensory inputs. The deep learning architecture optimizes a variant of the expected free energy and encodes the continuous state representation by means of a variational autoencoder. We show, in the OpenAI benchmark, that our approach has comparable or better performance than deep Q-learning, a state-of-the-art deep reinforcement learning algorithm.
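The abstract names two ingredients: a variational autoencoder that encodes a continuous state representation, and a policy driven by a variant of the expected free energy. The sketch below illustrates generic versions of both in NumPy and is not the paper's implementation. It assumes a diagonal Gaussian posterior with a standard normal prior, and a softmax over negative expected free energy values for action selection. The function names, the `precision` parameter, and the example numbers are all hypothetical.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over the last axis.

    This is the usual VAE regulariser; using it here is an assumption,
    since the abstract does not spell out the loss.
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def sample_latent(mu, log_var, rng):
    """Reparameterised sample z = mu + sigma * eps, with eps ~ N(0, I)."""
    mu = np.asarray(mu, dtype=float)
    std = np.exp(0.5 * np.asarray(log_var, dtype=float))
    return mu + std * rng.standard_normal(mu.shape)


def policy_from_efe(efe, precision=1.0):
    """Softmax over negative expected free energy (EFE).

    Actions with lower EFE receive higher probability. `precision` is a
    hypothetical inverse-temperature parameter.
    """
    logits = -precision * np.asarray(efe, dtype=float)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# Usage with made-up numbers: a 4-dimensional latent state and two actions.
rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)
z = sample_latent(mu, log_var, rng)
kl = gaussian_kl_to_standard_normal(mu, log_var)
action_probs = policy_from_efe([1.2, 0.3])
action = rng.choice(len(action_probs), p=action_probs)
```

In a full model the encoder network would produce `mu` and `log_var` from the observations, and a learned network would estimate the expected free energy of each action. Both are replaced by fixed arrays here to keep the example self-contained.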