Shutdown Safety Valves for Advanced AI
This paper addresses a foundational problem in AI safety: how to ensure that humans retain control over advanced AI systems.
This paper explores an unorthodox proposal for addressing the concern that advanced AI will resist human shutdown: giving the AI a primary goal of being turned off. It discusses the conditions under which this might be a viable safety strategy.
One common concern about advanced artificial intelligence is that it will prevent us from turning it off, because being turned off would interfere with the pursuit of its goals. In this paper, we discuss an unorthodox proposal for addressing this concern: give the AI a (primary) goal of being turned off (see also papers by Martin et al., and by Goldstein and Robinson). We also discuss whether, and under what conditions, this would be a good idea.