LG | Dec 2, 2025

Retrieval-Augmented Memory for Online Learning

arXiv:2512.02333v1 | h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses concept drift in streaming supervised learning, with applications such as electricity pricing and airline delay prediction. It offers a practical tool that improves incrementally on existing methods.

The paper tackles online classification in non-stationary environments with concept drift by proposing RAM-OL, a retrieval-augmented memory extension of stochastic gradient descent. It shows improvements in prequential accuracy by up to about seven percentage points on drifting data streams.
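Prequential accuracy is commonly computed in a test-then-train loop: each example is first used to score the current model and only afterwards to update it. The sketch below illustrates that standard protocol; it is not taken from the paper. The names `prequential_accuracy` and `MajorityClass` are hypothetical, and the majority-class learner is only a stand-in for a real online model.

```python
from typing import Callable, Hashable, Iterable, Tuple


def prequential_accuracy(
    stream: Iterable[Tuple[object, Hashable]],
    predict: Callable[[object], Hashable],
    update: Callable[[object, Hashable], None],
) -> float:
    """Test-then-train evaluation: predict on each example, then learn from it."""
    correct = 0
    total = 0
    for x, y in stream:
        if predict(x) == y:  # score with the model as it was BEFORE seeing (x, y)
            correct += 1
        update(x, y)         # only now is the example used for training
        total += 1
    return correct / total if total else 0.0


class MajorityClass:
    """Hypothetical stand-in learner: always predicts the most frequent label so far."""

    def __init__(self) -> None:
        self.counts = {0: 0, 1: 0}

    def predict(self, x: object) -> int:
        # Ties are broken in favour of label 0.
        return 1 if self.counts[1] > self.counts[0] else 0

    def update(self, x: object, y: int) -> None:
        self.counts[y] += 1


learner = MajorityClass()
toy_stream = [(None, 1), (None, 1), (None, 0), (None, 1), (None, 1)]
acc = prequential_accuracy(toy_stream, learner.predict, learner.update)
```

Because every prediction is made before the corresponding label is seen, the score reflects how quickly a model adapts on the stream rather than how well it fits data it has already trained on.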

Retrieval-augmented models couple parametric predictors with non-parametric memories, but their use in streaming supervised learning with concept drift is not well understood. We study online classification in non-stationary environments and propose Retrieval-Augmented Memory for Online Learning (RAM-OL), a simple extension of stochastic gradient descent that maintains a small buffer of past examples. At each time step, RAM-OL retrieves a few nearest neighbours of the current input in the hidden representation space and updates the model jointly on the current example and the retrieved neighbours. We compare a naive replay variant with a gated replay variant that constrains neighbours using a time window, similarity thresholds, and gradient reweighting, in order to balance fast reuse of relevant past data against robustness to outdated regimes. From a theoretical perspective, we interpret RAM-OL under a bounded drift model and discuss how retrieval can reduce adaptation cost and improve regret constants when patterns recur over time. Empirically, we instantiate RAM-OL on a simple online multilayer perceptron and evaluate it on three real-world data streams derived from electricity pricing, electricity load, and airline delay data. On strongly and periodically drifting streams, RAM-OL improves prequential accuracy by up to about seven percentage points and greatly reduces variance across random seeds, while on a noisy airline stream the gated variant closely matches the purely online baseline. These results show that retrieval-augmented memory is a practical and robust tool for online learning under concept drift.
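The gated replay step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Everything here is hypothetical: the names `OnlineMLP`, `gated_retrieve`, and `ram_ol_step`, the use of cosine similarity, the similarity value doubling as the gradient weight, and all default hyperparameters. The joint update is approximated by sequential weighted SGD steps, and stored hidden vectors are not refreshed as the model changes.

```python
import math
import random
from collections import deque


class OnlineMLP:
    """Tiny one-hidden-layer binary classifier trained by SGD on log loss."""

    def __init__(self, n_in, n_hidden, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0
        self.lr = lr

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def prob(self, x):
        z = sum(w * h for w, h in zip(self.w2, self.hidden(x))) + self.b2
        return 1.0 / (1.0 + math.exp(-z))

    def sgd_step(self, x, y, weight=1.0):
        h = self.hidden(x)
        z = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        g = weight * (1.0 / (1.0 + math.exp(-z)) - y)  # weighted dLoss/dz
        for j, hj in enumerate(h):
            gh = g * self.w2[j] * (1.0 - hj * hj)      # backprop through tanh
            self.w2[j] -= self.lr * g * hj
            for i, xi in enumerate(x):
                self.W1[j][i] -= self.lr * gh * xi
            self.b1[j] -= self.lr * gh
        self.b2 -= self.lr * g


def cosine(u, v):
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def gated_retrieve(buffer, h, t, k=3, window=100, min_sim=0.5):
    """Return up to k (entry, weight) pairs that pass the time and similarity gates."""
    candidates = []
    for entry in buffer:                 # entry = (time step, x, hidden vector, label)
        t_i, _, h_i, _ = entry
        if t - t_i > window:             # time-window gate: skip stale examples
            continue
        sim = cosine(h, h_i)
        if sim < min_sim:                # similarity-threshold gate
            continue
        candidates.append((sim, entry))
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [(entry, sim) for sim, entry in candidates[:k]]  # weight = similarity (assumed)


def ram_ol_step(model, buffer, x, y, t, **gate_kwargs):
    """One online step: retrieve neighbours, update on current + neighbours, store."""
    h = model.hidden(x)
    neighbours = gated_retrieve(buffer, h, t, **gate_kwargs)
    model.sgd_step(x, y)                                  # current example, full weight
    for (_, x_i, _, y_i), w in neighbours:
        model.sgd_step(x_i, y_i, weight=w)                # replayed neighbours, reweighted
    buffer.append((t, list(x), h, y))                     # deque(maxlen=...) evicts the oldest
```

A bounded `collections.deque(maxlen=...)` keeps the buffer small. Setting `window` to a very large value and `min_sim` to `-1.0` disables both gates, which gives something closer to the naive replay variant.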
