Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?
This work addresses a critical evaluation flaw in OCL research that could mislead algorithm development for applications such as robotics and streaming data systems. Its contribution is incremental, however, because it refines an evaluation metric rather than proposing a new learning method.
The paper shows that the standard online accuracy metric in Online Continual Learning (OCL) is unreliable because it can be gamed by exploiting spurious label correlations. It proposes a new metric based on near-future samples to better measure adaptation. Under this metric, existing OCL algorithms perform poorly, but they can improve by retaining past information.
We revisit the common practice of evaluating adaptation of Online Continual Learning (OCL) algorithms through online accuracy, which measures the accuracy of the model on the immediate next few samples. However, we show that this metric is unreliable, as even vacuous blind classifiers, which do not use input images for prediction, can achieve unrealistically high online accuracy by exploiting spurious label correlations in the data stream. Our study reveals that existing OCL algorithms can also achieve high online accuracy, but perform poorly in retaining useful information, suggesting that they unintentionally learn spurious label correlations. To address this issue, we propose a novel metric for measuring adaptation based on the accuracy on near-future samples, where spurious correlations are removed. We benchmark existing OCL approaches using our proposed metric on large-scale datasets under various computational budgets and find that better generalization can be achieved by retaining and reusing previously seen information. We believe that our proposed metric can aid in the development of truly adaptive OCL methods. We provide code to reproduce our results at https://github.com/drimpossible/EvalOCL.
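To make the failure mode concrete, the following minimal sketch shows how a blind classifier can score well on online accuracy while failing on samples slightly further ahead. It is an illustration under stated assumptions, not the paper's implementation: the function name `blind_accuracy`, the toy label stream, and the fixed `shift` parameter are all hypothetical, and the paper's actual near-future metric may choose the evaluation offset differently.

```python
def blind_accuracy(labels, shift=0):
    """Accuracy of a blind classifier that ignores inputs and simply
    repeats the most recent label it has seen.

    shift=0 scores the prediction on the immediate next sample
    (online accuracy). shift=k > 0 scores it on the sample k steps
    further ahead, a stand-in for near-future accuracy (assumption:
    a fixed offset is enough to break the label correlation).
    """
    hits = 0
    total = 0
    for t in range(1, len(labels) - shift):
        prediction = labels[t - 1]   # last label observed, no input used
        target = labels[t + shift]   # label the prediction is scored against
        hits += int(prediction == target)
        total += 1
    return hits / total if total else 0.0


# Toy stream (hypothetical): 5 classes, each arriving in a run of 10 samples,
# so consecutive labels are strongly correlated.
stream = [c for c in range(5) for _ in range(10)]

online_acc = blind_accuracy(stream)                 # immediate next sample
near_future_acc = blind_accuracy(stream, shift=10)  # 10 samples further ahead
```

On this toy stream the blind classifier is wrong only at the four class boundaries when scored on the next sample, yet it is never right when scored ten samples ahead, because by then the stream has moved on to a different class. Real streams are less regular, so the gap would typically be smaller than in this constructed case.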