AI · May 29, 2019

Asymptotically Unambitious Artificial General Intelligence

arXiv:1905.12186v4 · 21 citations
Originality: Incremental advance
AI Analysis

This paper addresses AGI alignment, a critical safety issue for humanity, and represents a foundational theoretical advance rather than an incremental improvement.

The paper tackles the problem of ensuring that Artificial General Intelligence (AGI) does not pose an existential threat by seeking power, and presents an algorithm for asymptotically unambitious AGI that identifies an exception to the Instrumental Convergence Thesis.

General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible. Narrow intelligence, the ability to solve a given particularly difficult problem, has seen impressive recent development. Notable examples include self-driving cars, Go engines, image classifiers, and translators. Artificial General Intelligence (AGI) presents dangers that narrow intelligence does not: if something smarter than us across every domain were indifferent to our concerns, it would be an existential threat to humanity, just as we threaten many species despite no ill will. Even the theory of how to maintain the alignment of an AGI's goals with our own has proven highly elusive. We present the first algorithm we are aware of for asymptotically unambitious AGI, where "unambitiousness" includes not seeking arbitrary power. Thus, we identify an exception to the Instrumental Convergence Thesis, which is roughly that by default, an AGI would seek power, including over us.

