CL, AI | Apr 5, 2025

Rethinking Reflection in Pre-Training

Essential AI, Darsh J Shah, Peter Rushton, Somanshu Singla, Mohit Parmar, Kurt Smith, Yash Vanjani, Ashish Vaswani, Adarsh Chaluvaraju, Andrew Hojel, Andrew Ma, Anil Thomas

arXiv:2504.04022v1 | 29.344 citations | h-index: 24 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses when the ability to self-reflect first develops in language models, a question of interest to AI researchers. The advance is incremental, since it builds on existing work on reflection.

The study investigated the emergence of self-reflection in language models during pre-training, finding that models can recognize and correct deliberate errors in reasoning chains, with an OLMo2-7B model showing this ability after training on 4 trillion tokens.

A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually begins to emerge much earlier - during the model's pre-training. To study this, we introduce deliberate errors into chains-of-thought and test whether the model can still arrive at the correct answer by recognizing and correcting these mistakes. By tracking performance across different stages of pre-training, we observe that this self-correcting ability appears early and improves steadily over time. For instance, an OLMo2-7B model pre-trained on 4 trillion tokens displays self-correction on our six self-reflection tasks.
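The evaluation idea in the abstract (insert a deliberate error into a chain-of-thought, then check whether the model still reaches the correct answer) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `corrupt_prefix`, `build_prompt`, `last_number`, and `self_corrects`, the stub generators, and the use of "last number in the completion" as the answer check are all assumptions made for this example.

```python
import re
from typing import Callable, List, Optional


def corrupt_prefix(steps: List[str], index: int, wrong_step: str) -> List[str]:
    """Keep the steps before `index`, then end the chain with a deliberately wrong step."""
    return steps[:index] + [wrong_step]


def build_prompt(question: str, steps: List[str]) -> str:
    """Join the question and the (corrupted) partial chain-of-thought into one prompt."""
    return question + "\n" + "\n".join(steps) + "\n"


def last_number(text: str) -> Optional[str]:
    """Return the last number in the text; a crude, hypothetical final-answer extractor."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return matches[-1] if matches else None


def self_corrects(generate: Callable[[str], str], question: str,
                  corrupted_steps: List[str], gold: str) -> bool:
    """True if the model's continuation of the corrupted chain ends at the gold answer."""
    completion = generate(build_prompt(question, corrupted_steps))
    return last_number(completion) == gold


# Hypothetical stand-ins for a model checkpoint's text-generation call.
def reflective_stub(prompt: str) -> str:
    return "Wait, 3 * 4 is 12, not 11. 12 + 5 = 17"


def non_reflective_stub(prompt: str) -> str:
    return "11 + 5 = 16"


question = "What is 3 * 4 + 5?"
clean_steps = ["3 * 4 = 12", "12 + 5 = 17"]
corrupted = corrupt_prefix(clean_steps, 0, "3 * 4 = 11")

print(self_corrects(reflective_stub, question, corrupted, "17"))
print(self_corrects(non_reflective_stub, question, corrupted, "17"))
```

Running the same check with `generate` bound to successive pre-training checkpoints would give one self-correction rate per checkpoint, which is the kind of trend the abstract describes tracking.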
