Time is Encoded in the Weights of Finetuned Language Models
The paper provides a method for adapting language models to temporal shift; the contribution is incremental but useful for applications that require up-to-date text understanding.
The authors tackle the problem of adapting language models to new time periods by introducing time vectors: they finetune a model on data from a specific time period and subtract the original pretrained weights. The resulting vector improves performance on text from that period, and interpolating between time vectors yields models that perform better on intervening and future periods without additional training.
We present time vectors, a simple tool to customize language models to new time periods. Time vectors are created by finetuning a language model on data from a single time (e.g., a year or month), and then subtracting the weights of the original pretrained model. This vector specifies a direction in weight space that, as our experiments show, improves performance on text from that time period. Time vectors specialized to adjacent time periods appear to be positioned closer together in a manifold. Using this structure, we interpolate between time vectors to induce new models that perform better on intervening and future time periods, without any additional training. We demonstrate the consistency of our findings across different tasks, domains, model sizes, and time scales. Our results suggest that time is encoded in the weight space of finetuned models.
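The arithmetic described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes weights are stored as a dictionary from parameter name to a flat list of floats, and it assumes interpolation is a simple linear combination with a mixing coefficient `alpha`. The names `time_vector`, `interpolate`, and `apply_vector`, and the toy weights, are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flattened parameter values.
Weights = Dict[str, List[float]]


def time_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Finetuned weights minus pretrained weights, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def interpolate(tau_a: Weights, tau_b: Weights, alpha: float) -> Weights:
    """Assumed linear interpolation: alpha * tau_a + (1 - alpha) * tau_b."""
    return {
        name: [alpha * a + (1 - alpha) * b for a, b in zip(tau_a[name], tau_b[name])]
        for name in tau_a
    }


def apply_vector(pretrained: Weights, tau: Weights) -> Weights:
    """Add a time vector back onto the pretrained weights."""
    return {
        name: [p + t for p, t in zip(pretrained[name], tau[name])]
        for name in pretrained
    }


# Toy weights standing in for one pretrained and two finetuned models.
pretrained = {"w": [1.0, 2.0]}
finetuned_2012 = {"w": [1.5, 2.0]}
finetuned_2016 = {"w": [2.5, 4.0]}

tau_2012 = time_vector(finetuned_2012, pretrained)
tau_2016 = time_vector(finetuned_2016, pretrained)

# A model for an intervening period, built without any further training.
tau_mid = interpolate(tau_2012, tau_2016, alpha=0.5)
model_mid = apply_vector(pretrained, tau_mid)
```

With real models the same operations would be applied tensor by tensor to the checkpoints, and `alpha` would typically be chosen by evaluating on held-out text from the target period.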