Formalizing Embeddedness Failures in Universal Artificial Intelligence
This work addresses foundational issues in AI theory and is aimed at researchers in universal artificial intelligence and embedded agency. Its contribution is incremental, since it builds on existing critiques and formalizations.
The paper formalizes known failure modes of the AIXI reinforcement learning agent as a model of embedded agency and proves that they occur, focusing on a variant that models the joint action/percept history as drawn from the universal distribution. It also evaluates progress toward a theory of embedded agency based on AIXI variants.
We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial intelligence, focusing on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution. We also evaluate the progress that has been made towards a successful theory of embedded agency based on variants of the AIXI agent.