Artificial Intelligent Disobedience: Rethinking the Agency of Our Artificial Teammates
The paper addresses the problem of rigid obedience in cooperative AI systems, which can be counterproductive or unsafe for users. Its contribution is incremental, since it builds on existing autonomy research.
The paper argues that AI teammates should be capable of intelligent disobedience to enhance safety and productivity in human-AI teams, proposing a scale of AI agency levels and initial boundaries for studying this capability.
Artificial intelligence has made remarkable strides in recent years, achieving superhuman performance across a wide range of tasks. Yet despite these advances, most cooperative AI systems remain rigidly obedient, designed to follow human instructions without question and conform to user expectations, even when doing so may be counterproductive or unsafe. This paper argues for expanding the agency of AI teammates to include \textit{intelligent disobedience}, empowering them to make meaningful and autonomous contributions within human-AI teams. It introduces a scale of AI agency levels and uses representative examples to highlight the importance and growing necessity of treating AI autonomy as an independent research focus in cooperative settings. The paper then explores how intelligent disobedience manifests across different autonomy levels and concludes by proposing initial boundaries and considerations for studying disobedience as a core capability of artificial agents.