THAI
Sep 25, 2022

Exploring the Constraints on Artificial General Intelligence: A Game-Theoretic No-Go Theorem

arXiv:2209.12346v2
h-index: 5
AI Analysis

This work addresses theoretical safety concerns about superhuman AI that are relevant to policymakers and AI researchers. It is incremental, however, in that it builds on existing game-theoretic concepts.

The paper tackles the problem of ensuring safe interactions between humans and a potential superhuman AI by proposing a game-theoretic framework. Its main result is an impossibility theorem: four key assumptions are inconsistent when taken together but become consistent if any one of them is relaxed. The paper also offers policy recommendations on data control and researcher access.

The emergence of increasingly sophisticated artificial intelligence (AI) systems has sparked intense debate among researchers, policymakers, and the public due to their potential to surpass human intelligence and capabilities in all domains. In this paper, I propose a game-theoretic framework that captures the strategic interactions between a human agent and a potential superhuman machine agent. I identify four key assumptions: Strategic Unpredictability, Access to Machine's Strategy, Rationality, and Superhuman Machine. The main result of this paper is an impossibility theorem: these four assumptions are inconsistent when taken together, but relaxing any one of them results in a consistent set of assumptions. Two straightforward policy recommendations follow: first, policymakers should control access to specific human data to maintain Strategic Unpredictability; and second, they should grant select AI researchers access to superhuman machine research to ensure Access to Machine's Strategy holds. My analysis contributes to a better understanding of the context that can shape the theoretical development of superhuman AI.
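
The logical shape of the impossibility result can be written compactly. This is only a sketch of the statement's structure as described in the abstract. The abbreviations SU, AMS, R, and SM and the set notation are labels introduced here for illustration, not the paper's own notation, and the formal definitions of the individual assumptions are not reproduced.

```latex
% Labels assumed here for illustration (not taken from the paper):
%   SU  = Strategic Unpredictability
%   AMS = Access to Machine's Strategy
%   R   = Rationality
%   SM  = Superhuman Machine
\[
  \mathcal{A} = \{\mathrm{SU},\ \mathrm{AMS},\ \mathrm{R},\ \mathrm{SM}\}
\]
\[
  \mathcal{A} \text{ is inconsistent, yet }
  \mathcal{A} \setminus \{a\} \text{ is consistent for every } a \in \mathcal{A}.
\]
```

Read this way, the four assumptions form a minimal inconsistent set: the contradiction needs all four at once, and dropping any single one removes it.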

