An AGI with Time-Inconsistent Preferences
This paper addresses a foundational issue in AGI design: an oversight in how economic discounting models are applied to AI decision-making.
The paper identifies a trap in AGI theory: the implicit assumption that a rational AGI has time-consistent preferences, which the paper argues is false. An AGI with time-inconsistent preferences cannot automatically trust its future self to execute the plans its current self considers optimal.
This paper reveals a trap for artificial general intelligence (AGI) theorists who use economists' standard method of discounting. This trap is implicitly and falsely assuming that a rational AGI would have time-consistent preferences. An agent with time-inconsistent preferences knows that its future self will disagree with its current self concerning intertemporal decision making. Such an agent cannot automatically trust its future self to carry out plans that its current self considers optimal.
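The disagreement between current and future selves can be illustrated with a minimal sketch. It is not taken from the paper: the hyperbolic discount function, the parameter values (`rate`, `k`), and the reward amounts and dates are all assumptions chosen for illustration. The sketch compares an exponential discounter, whose ranking of two dated rewards does not change as time passes, with a hyperbolic discounter, whose ranking flips once the earlier reward becomes imminent.

```python
def exponential_discount(delay, rate=0.1):
    """Weight on a reward `delay` periods away (illustrative rate)."""
    return (1.0 + rate) ** (-delay)


def hyperbolic_discount(delay, k=1.0):
    """Hyperbolic weight 1 / (1 + k * delay); assumed form and parameter."""
    return 1.0 / (1.0 + k * delay)


def prefers_later(discount, now, small=(10.0, 5), large=(15.0, 6)):
    """Return True if the agent at time `now` values the larger, later
    reward above the smaller, sooner one.

    Each reward is a hypothetical (amount, delivery_time) pair.
    """
    small_amount, small_time = small
    large_amount, large_time = large
    value_small = small_amount * discount(small_time - now)
    value_large = large_amount * discount(large_time - now)
    return value_large > value_small


if __name__ == "__main__":
    for name, discount in [("exponential", exponential_discount),
                           ("hyperbolic", hyperbolic_discount)]:
        plan_at_0 = prefers_later(discount, now=0)
        choice_at_5 = prefers_later(discount, now=5)
        print(f"{name}: plans to wait at t=0: {plan_at_0}, "
              f"still waits at t=5: {choice_at_5}")
```

Under these assumed numbers, the exponential agent prefers the later reward at both dates, while the hyperbolic agent plans at t=0 to wait and then, at t=5, takes the smaller reward instead. That reversal is the sense in which the current self cannot rely on the future self to carry out the plan it now considers optimal.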