Inference Scaling Reshapes AI Governance
This paper addresses governance challenges relevant to policymakers and AI developers, but its contribution is incremental: it builds on existing scaling discussions and presents no new empirical data.
The paper examines how a shift from scaling pre-training compute to scaling inference compute could reshape AI governance. The effects depend on whether the additional inference compute is used mainly during external deployment or during training inside the lab. In the deployment case, they include reduced importance of open-weight models and a changed business model for frontier AI.
The shift from scaling up the pre-training compute of AI systems to scaling up their inference compute may have profound effects on AI governance. The nature of these effects depends crucially on whether this new inference compute will primarily be used during external deployment or as part of a more complex training programme within the lab. Rapid scaling of inference-at-deployment would: lower the importance of open-weight models (and of securing the weights of closed models), reduce the impact of the first human-level models, change the business model for frontier AI, reduce the need for power-intense data centres, and derail the current paradigm of AI governance via training compute thresholds. Rapid scaling of inference-during-training would have more ambiguous effects that range from a revitalisation of pre-training scaling to a form of recursive self-improvement via iterated distillation and amplification.