Rolling with the Punches: Resilient Contrastive Pre-training under Non-Stationary Drift
This work addresses a critical emerging challenge for large-scale contrastive pre-training on dynamic data streams and offers a solution for improving robustness in evolving environments.
The paper tackles concept drift in contrastive pre-training by proposing Resilient Contrastive Pre-training (RCP), which mitigates drift-induced biases and yields more resilient representations, as demonstrated in experiments across diverse downstream tasks.
The remarkable success of large-scale contrastive pre-training, fueled by vast, curated datasets, is encountering new frontiers as the scaling paradigm evolves. A critical emerging challenge is the effective pre-training of models on dynamic data streams characterized by concept drift, that is, unpredictable changes in the underlying data distribution. This paper undertakes a foundational investigation of this issue. We first show that conventional contrastive pre-training methods are notably vulnerable to concept drift, which leads to significant biases in the learned feature space of pre-trained models. To analyze these effects systematically, we construct a structural causal model that shows how drift acts as a confounder and distorts learned representations. Based on these causal insights, we propose Resilient Contrastive Pre-training (RCP), a novel method that incorporates causal intervention. RCP introduces a causally informed objective that mitigates drift-induced biases through targeted interventions. RCP is designed for simple, scalable implementation and adapts readily to evolving data, promoting robust pre-training. Comprehensive experiments across diverse downstream tasks demonstrate that RCP alleviates the detrimental impact of concept drift, yielding more resilient and generalizable representations.
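To make the confounding argument concrete, the following is a minimal sketch of the standard backdoor adjustment, not the paper's actual objective. The symbols are assumptions introduced here for illustration: X denotes an input sample, Y the association the model learns from it, and D the drift variable, which is assumed to influence both X and Y and to satisfy the backdoor criterion.

```latex
% Observational quantity: the drift D is entangled with the input X,
% because the weights P(D = d | X) depend on X.
P(Y \mid X) = \sum_{d} P(Y \mid X, D = d)\, P(D = d \mid X)

% Interventional quantity (backdoor adjustment): each drift state is
% weighted by its marginal P(D = d), which cuts the path D -> X.
P(Y \mid \mathrm{do}(X)) = \sum_{d} P(Y \mid X, D = d)\, P(D = d)
```

The two expressions differ only in how each drift state is weighted. An objective built on the second one would, under these assumptions, stop the model from exploiting correlations that exist only because of the current drift state.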