LG AI | Jun 17, 2024

Transcendence: Generative Models Can Outperform The Experts That Train Them

Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

arXiv:2406.11741v418.825 citations, h-index: 75

Originality: Incremental advance

AI Analysis

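This work addresses a presumed limitation of generative models: that imitating human-generated data caps a model at the skill of the people who produced that data. It shows that a model can outperform the experts in its training set. The advance is rated incremental because it builds on existing methods and contributes a new theoretical insight.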

The paper asks whether generative models can do more than imitate their training data. It introduces "transcendence", where a model surpasses the experts who generated its data, and demonstrates it by training an autoregressive transformer on chess game transcripts; the trained model can sometimes outperform every player in the dataset.

Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities that surpass the abilities of the experts generating its data. We demonstrate transcendence by training an autoregressive transformer to play chess from game transcripts, and show that the trained model can sometimes achieve better performance than all players in the dataset. We theoretically prove that transcendence can be enabled by low-temperature sampling, and rigorously assess this claim experimentally. Finally, we discuss other sources of transcendence, laying the groundwork for future investigation of this phenomenon in a broader setting.
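One way to build intuition for the low-temperature claim is a toy calculation. The sketch below is not the authors' experimental setup: the three-move position, the reward vector, and the two expert policies are hypothetical values chosen for illustration. It assumes the model perfectly imitates an even mixture of two experts who each blunder in a different way, then shows how lowering the sampling temperature concentrates probability on the move both experts favor.

```python
def apply_temperature(probs, tau):
    """Rescale a probability distribution by temperature tau (tau > 0).

    Equivalent to softmax(log(p) / tau); zero-probability entries stay zero.
    """
    weights = [p ** (1.0 / tau) if p > 0 else 0.0 for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(probs, rewards):
    """Expected reward of sampling one move from probs."""
    return sum(p * r for p, r in zip(probs, rewards))


# Hypothetical position with three candidate moves; only move 0 wins.
rewards = [1.0, 0.0, 0.0]

# Two hypothetical experts: both usually find the winning move,
# but each one's mistakes land on a different losing move.
expert_a = [0.6, 0.4, 0.0]
expert_b = [0.6, 0.0, 0.4]

# Assumption: the trained model matches the even mixture of both experts.
mixture = [(a + b) / 2 for a, b in zip(expert_a, expert_b)]

print("expert A:", expected_reward(expert_a, rewards))
print("expert B:", expected_reward(expert_b, rewards))
for tau in (1.0, 0.5, 0.1):
    sampled = apply_temperature(mixture, tau)
    print(f"model at tau={tau}:", round(expected_reward(sampled, rewards), 4))
```

At temperature 1 the imitating model only matches the experts' expected reward. As the temperature drops, the mixture's probability mass shifts toward its most likely move, which in this toy case is the winning one, so the model's expected reward rises above that of either expert. The effect depends on the experts' errors being spread across different moves; if both experts made the same mistake most of the time, low-temperature sampling would amplify that mistake instead.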
