AI, LG | Aug 13, 2024

Value of Information and Reward Specification in Active Inference and POMDPs

arXiv:2408.06542v1 | 5 citations | h-index: 3
Originality: Incremental advance
AI Analysis

This work offers theoretical insight into how the objective of an active inference agent is specified. The advance is incremental, but it clarifies how these agents relate to established RL frameworks.

The paper tackles the problem of understanding the optimality gap between active inference agents using expected free energy (EFE) and reward-driven reinforcement learning agents, showing that EFE approximates the Bayes optimal RL policy via information value.

Expected free energy (EFE) is a central quantity in active inference which has recently gained popularity due to its intuitive decomposition of the expected value of control into a pragmatic and an epistemic component. While numerous conjectures have been made to justify EFE as a decision-making objective function, the most widely accepted justification is still its intuitiveness and resemblance to variational free energy in approximate Bayesian inference. In this work, we take a bottom-up approach and ask: taking EFE as given, what is the resulting agent's optimality gap compared with a reward-driven reinforcement learning (RL) agent, which is well understood? By casting EFE under a particular class of belief MDP and using analysis tools from RL theory, we show that EFE approximates the Bayes optimal RL policy via information value. We discuss the implications for objective specification of active inference agents.
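
To make the pragmatic/epistemic decomposition mentioned in the abstract concrete, here is a minimal sketch of the common one-step, discrete-state form of negative EFE: expected log preference over observations (pragmatic) plus expected information gain about the hidden state (epistemic). This is a textbook-style illustration, not the paper's belief-MDP construction. The function names, the two-state toy model, and the "look"/"ignore" actions are all assumptions made up for the example.

```python
import math

def predicted_state(belief, transition):
    """Q(s'|a) = sum_s b[s] * T[s][s'] for one fixed action (hypothetical helper)."""
    n_next = len(transition[0])
    return [sum(belief[s] * transition[s][s2] for s in range(len(belief)))
            for s2 in range(n_next)]

def efe_terms(belief, transition, likelihood, log_pref):
    """Return (pragmatic, epistemic) for one action, one step ahead.

    likelihood[s][o] = P(o | s); log_pref[o] = log preference over observations.
    Negative EFE is taken here as pragmatic + epistemic.
    """
    qs = predicted_state(belief, transition)
    n_obs = len(likelihood[0])
    # Predicted observation distribution Q(o|a).
    qo = [sum(qs[s] * likelihood[s][o] for s in range(len(qs)))
          for o in range(n_obs)]
    # Pragmatic term: expected log preference under predicted observations.
    pragmatic = sum(qo[o] * log_pref[o] for o in range(n_obs))
    # Epistemic term: expected KL from prior Q(s|a) to posterior Q(s|o,a),
    # which equals the mutual information between state and observation.
    epistemic = 0.0
    for o in range(n_obs):
        if qo[o] == 0.0:
            continue
        for s in range(len(qs)):
            joint = qs[s] * likelihood[s][o]
            if joint > 0.0:
                posterior = joint / qo[o]
                epistemic += joint * math.log(posterior / qs[s])
    return pragmatic, epistemic

# Toy model (assumed): two hidden states, static dynamics, flat preferences.
belief = [0.5, 0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
flat_log_pref = [math.log(0.5), math.log(0.5)]

# "look" yields an informative observation; "ignore" yields pure noise.
look = efe_terms(belief, identity, [[0.9, 0.1], [0.1, 0.9]], flat_log_pref)
ignore = efe_terms(belief, identity, [[0.5, 0.5], [0.5, 0.5]], flat_log_pref)

neg_efe_look = look[0] + look[1]
neg_efe_ignore = ignore[0] + ignore[1]
```

With flat preferences the pragmatic terms coincide, so the two actions differ only in their epistemic term, and the informative action scores higher. A purely reward-driven one-step objective would be indifferent between them in this toy setup, which is the kind of gap the abstract frames in terms of information value.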

