LG · AI · GT · Feb 24, 2017

Strongly-Typed Agents are Guaranteed to Interact Safely

arXiv:1702.07450v2 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the safety of agent interactions in AI systems, providing theoretical guarantees for well-behaved algorithms, but it appears incremental as it builds on existing concepts like Nash equilibria and gradient descent.

The paper tackles the problem of ensuring safe interactions between artificial agents. It formalizes safety as 'no harm' and focuses on agents that update their actions by gradient descent. It shows that gradient descent converges to a Nash equilibrium in safe games and that strongly-typed agents are guaranteed to interact safely.

As artificial agents proliferate, it is becoming increasingly important to ensure that their interactions with one another are well-behaved. In this paper, we formalize a common-sense notion of when algorithms are well-behaved: an algorithm is safe if it does no harm. Motivated by recent progress in deep learning, we focus on the specific case where agents update their actions according to gradient descent. The paper shows that gradient descent converges to a Nash equilibrium in safe games. The main contribution is to define strongly-typed agents and show they are guaranteed to interact safely, thereby providing sufficient conditions to guarantee safe interactions. A series of examples show that strong-typing generalizes certain key features of convexity, is closely related to blind source separation, and introduces a new perspective on classical multilinear games based on tensor decomposition.
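
To make the setting concrete, the following is a minimal sketch of two agents that each update their own action by gradient descent on their own loss. The game is a hypothetical toy example chosen for illustration, not one taken from the paper, and the sketch does not implement the paper's definitions of safety or strong typing. The function names, losses, learning rate, and step count are all assumptions. Each loss is convex in the agent's own action, so the point where both gradients vanish is a Nash equilibrium, which for these losses is (4/3, -4/3).

```python
# Hypothetical two-player game (not from the paper):
#   player 1 picks x and minimizes f1(x, y) = (x - 1)^2 + 0.5 * x * y
#   player 2 picks y and minimizes f2(x, y) = (y + 1)^2 + 0.5 * x * y

def grad_player1(x, y):
    """Partial derivative of f1 with respect to player 1's own action x."""
    return 2.0 * (x - 1.0) + 0.5 * y


def grad_player2(x, y):
    """Partial derivative of f2 with respect to player 2's own action y."""
    return 2.0 * (y + 1.0) + 0.5 * x


def simultaneous_gradient_descent(x0, y0, lr=0.1, steps=200):
    """Both players take a gradient step on their own loss at the same time."""
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad_player1(x, y), grad_player2(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y


x_star, y_star = simultaneous_gradient_descent(0.0, 0.0)
print(x_star, y_star)  # approaches the Nash equilibrium (4/3, -4/3)
```

In this toy game the coupling between the players is weak relative to each loss's own curvature, so simultaneous updates contract toward the equilibrium. Games without that structure can cycle or diverge under the same update rule, which is the kind of behavior sufficient conditions for safe interaction are meant to exclude.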
