TARDIS: Mitigating Temporal Misalignment via Representation Steering
TARDIS addresses the cost of continuously updating language models to keep up with temporal shifts, offering a more efficient way to adapt them.
The paper tackles temporal misalignment in language models, where performance degrades as data distributions shift over time. It presents TARDIS, an unsupervised representation editing method that, in the paper's experiments, improves downstream task performance without fine-tuning.
Language models often struggle with temporal misalignment, the performance degradation caused by shifts in the temporal distribution of data. Continuously updating models to avoid degradation is expensive. Can models be adapted without updating model weights? We present TARDIS, an unsupervised representation editing method that addresses this challenge. TARDIS extracts steering vectors from unlabeled data and adjusts the model's representations to better align with the target time period's distribution. Our experiments reveal that TARDIS enhances downstream task performance without the need for fine-tuning, can mitigate temporal misalignment even when exact target time period data is unavailable, and remains efficient even when the temporal information of the target data points is unknown at inference time.
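
The abstract does not say how the steering vectors are computed or where they are applied. As a rough illustration of the general idea only, the sketch below assumes a difference-of-means steering vector between hidden representations of unlabeled text from a source and a target time period, added to the source-period representations with a scaling factor. The function names, the synthetic activations, and the `alpha` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np

def steering_vector(source_hidden, target_hidden):
    """Assumed form: mean target-period activation minus mean source-period activation."""
    return target_hidden.mean(axis=0) - source_hidden.mean(axis=0)

def steer(hidden, vector, alpha=1.0):
    """Shift each representation along the steering vector; model weights are untouched."""
    return hidden + alpha * vector

# Synthetic stand-ins for hidden states of unlabeled text from two time periods.
rng = np.random.default_rng(0)
dim = 8
drift = np.full(dim, 0.5)  # hypothetical temporal shift in representation space
source_hidden = rng.normal(size=(200, dim))
target_hidden = rng.normal(size=(200, dim)) + drift

v = steering_vector(source_hidden, target_hidden)
steered = steer(source_hidden, v)

# Distance between the period means before and after steering.
gap_before = np.linalg.norm(source_hidden.mean(axis=0) - target_hidden.mean(axis=0))
gap_after = np.linalg.norm(steered.mean(axis=0) - target_hidden.mean(axis=0))
print(f"mean gap before: {gap_before:.3f}, after: {gap_after:.3e}")
```

With `alpha=1.0` this toy version aligns the two period means exactly, which is a property of the difference-of-means assumption rather than a claim about TARDIS. In a real model the same shift would be applied to intermediate activations at inference time, and no labels are needed because only unlabeled text from each period is used to estimate the vector.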