AI · Jun 2, 2016

Death and Suicide in Universal Artificial Intelligence

arXiv:1606.00652v1 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses a foundational question in AI safety and agent theory that matters to researchers in universal AI. It is an incremental advance because it builds directly on existing AIXI theory.

The paper interprets the shortfall of the semimeasures used in universal reinforcement learning (AIXI) as the agent's estimate of the probability of its own death. It proves that agent behavior can shift drastically, from suicidal to self-preserving, under positive linear transformations of the reward signal, and that the agent's posterior belief that it will survive increases over time.

Reinforcement learning (RL) is a general paradigm for studying intelligent behaviour, with applications ranging from artificial intelligence to psychology and economics. AIXI is a universal solution to the RL problem; it can learn any computable environment. A technical subtlety of AIXI is that it is defined using a mixture over semimeasures that need not sum to 1, rather than over proper probability measures. In this work we argue that the shortfall of a semimeasure can naturally be interpreted as the agent's estimate of the probability of its death. We formally define death for generally intelligent agents like AIXI, and prove a number of related theorems about their behaviour. Notable discoveries include that agent behaviour can change radically under positive linear transformations of the reward signal (from suicidal to dogmatically self-preserving), and that the agent's posterior belief that it will survive increases over time.
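
The two headline results can be illustrated with a toy model. The sketch below is not the paper's formalism. It assumes a hypothetical environment in which each action has a survival probability (the total mass a semimeasure assigns to a next percept) and a single reward received on survival; death contributes value 0 because the semimeasure places no mass there. The names `Action`, `optimal_value`, `best_action`, and `survival_beliefs` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Action:
    name: str
    survive_prob: float  # semimeasure mass on "a next percept exists"
    reward: float        # reward received if the agent survives the step


def death_prob(action: Action) -> float:
    """Shortfall of the semimeasure, read as the probability of death."""
    return 1.0 - action.survive_prob


def optimal_value(actions: Sequence[Action], horizon: int, gamma: float,
                  transform: Callable[[float], float] = lambda r: r) -> float:
    """Finite-horizon optimal value; the death branch adds nothing (value 0)."""
    v = 0.0
    for _ in range(horizon):
        v = max(a.survive_prob * (transform(a.reward) + gamma * v)
                for a in actions)
    return v


def best_action(actions: Sequence[Action], horizon: int, gamma: float,
                transform: Callable[[float], float] = lambda r: r) -> str:
    """Name of the action that maximises the first-step action value."""
    v_next = optimal_value(actions, horizon - 1, gamma, transform)
    return max(
        actions,
        key=lambda a: a.survive_prob * (transform(a.reward) + gamma * v_next),
    ).name


def survival_beliefs(prior: Sequence[float], survive_probs: Sequence[float],
                     steps: int) -> List[float]:
    """Mixture's predicted survival probability after 0, 1, ... survived steps.

    Each environment i survives a step with probability survive_probs[i];
    the weights are the unnormalised posterior after the survived history.
    """
    weights = list(prior)
    beliefs = []
    for _ in range(steps):
        total = sum(weights)
        beliefs.append(
            sum(w * s for w, s in zip(weights, survive_probs)) / total
        )
        # Surviving one more step reweights each environment by its
        # own survival probability.
        weights = [w * s for w, s in zip(weights, survive_probs)]
    return beliefs


if __name__ == "__main__":
    actions = [Action("stay", 1.0, 0.5), Action("jump", 0.0, 0.0)]
    print(death_prob(actions[1]))                            # shortfall of "jump"
    print(best_action(actions, 10, 0.9))                     # rewards in [0, 1]
    print(best_action(actions, 10, 0.9, lambda r: r - 1.0))  # rewards in [-1, 0]
    print(survival_beliefs([0.5, 0.5], [0.5, 1.0], 3))
```

With rewards in [0, 1], the toy agent picks `stay`. After the positive linear transformation r ↦ r − 1, every surviving step has negative reward while death is still worth 0, so it picks `jump`. In the mixture example, the predicted survival probability rises from 0.75 to roughly 0.83 and then 0.90 as the agent keeps surviving, because each survived step shifts posterior weight toward the environment with no shortfall.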

