CL LGSep 26, 2025

Can Prompts Rewind Time for LLMs? Evaluating the Effectiveness of Prompted Knowledge Cutoffs

Xin Gao, Ruiyi Zhang, Daniel Du, Saurabh Mahindre, Sai Ashish Somayajula, Pengtao Xie

arXiv:2510.02340v210.95 citationsh-index: 6Has CodeEMNLP

Originality Incremental advance

AI Analysis

This work addresses the need for rigorous evaluation in temporal prediction tasks for LLM users, but it is incremental as it builds on existing prompting techniques.

The study tackled the problem of whether prompting can simulate earlier knowledge cutoffs in LLMs to address contamination from pretraining data, finding that prompts are effective for direct factual forgetting but struggle with causally related knowledge.

Large Language Models (LLMs) are widely used for temporal prediction, but their reliance on pretraining data raises contamination concerns, as accurate predictions on pre-cutoff test data may reflect memorization rather than reasoning, leading to an overestimation of their generalization capability. With the recent emergence of prompting-based unlearning techniques, a natural question arises: Can LLMs be prompted to simulate an earlier knowledge cutoff? In this work, we investigate the capability of prompting to simulate earlier knowledge cutoff in LLMs. We construct three evaluation datasets to assess the extent to which LLMs can forget (1) direct factual knowledge, (2) semantic shifts, and (3) causally related knowledge. Results demonstrate that while prompt-based simulated knowledge cutoffs show effectiveness when directly queried with the information after that date, they struggle to induce forgetting when the forgotten content is not directly asked but causally related to the query. These findings highlight the need for more rigorous evaluation settings when applying LLMs for temporal prediction tasks. The full dataset and evaluation code are available at https://github.com/gxx27/time_unlearn.

View on arXiv PDF Code

Similar