Continual Learning in Linear Classification on Separable Data
This work provides theoretical insight into continual learning for classification, though the contribution is incremental because it builds on existing frameworks such as Projection Onto Convex Sets (POCS).
The paper studies continual learning on separable linear classification tasks. It shows that weak regularization leads to a sequential max-margin problem and derives upper bounds on forgetting under recurring task orderings, such as cyclic and random orderings.
We analyze continual learning on a sequence of separable linear classification tasks with binary labels. We show theoretically that learning with weak regularization reduces to solving a sequential max-margin problem, corresponding to a special case of the Projection Onto Convex Sets (POCS) framework. We then develop upper bounds on the forgetting and other quantities of interest under various settings with recurring tasks, including cyclic and random orderings of tasks. We discuss several practical implications for popular training practices such as regularization scheduling and weighting. We point out several theoretical differences between our continual classification setting and a recently studied continual regression setting.
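The sequential max-margin view can be illustrated with a minimal sketch. It assumes the common hard-margin form, in which each task's feasible set is {w : y * <w, x> >= 1} over that task's samples and each step projects the previous iterate onto the current task's set; the abstract does not state this form explicitly. To keep the projection in closed form, every task here holds a single sample, so its feasible set is one halfspace. The function names, the toy data, and the hinge-style violation used as a stand-in for forgetting are all hypothetical and may differ from the paper's definitions.

```python
def project_onto_halfspace(w, x, y):
    """Euclidean projection of w onto the halfspace {v : y * <v, x> >= 1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    if margin >= 1.0:
        return list(w)  # constraint already satisfied, nothing to do
    # Move along y * x by the smallest step that reaches margin exactly 1.
    step = (1.0 - margin) / sum(xi * xi for xi in x)
    return [wi + step * y * xi for wi, xi in zip(w, x)]


def cyclic_sequential_max_margin(tasks, cycles):
    """Project onto each task's set in cyclic order, starting from w = 0."""
    w = [0.0] * len(tasks[0][0])
    for _ in range(cycles):
        for x, y in tasks:
            w = project_onto_halfspace(w, x, y)
    return w


def max_violation(w, tasks):
    """Largest hinge violation max(0, 1 - y * <w, x>) over all tasks."""
    return max(
        max(0.0, 1.0 - y * sum(wi * xi for wi, xi in zip(w, x)))
        for x, y in tasks
    )


# Two hypothetical single-sample tasks (x, y) that are jointly separable.
tasks = [([1.0, 0.0], 1), ([1.0, 1.0], -1)]
w_one = cyclic_sequential_max_margin(tasks, cycles=1)
w_many = cyclic_sequential_max_margin(tasks, cycles=50)
print(max_violation(w_one, tasks), max_violation(w_many, tasks))
```

In this toy instance, fitting the second task pushes the iterate out of the first task's feasible set, and repeating the cycle shrinks that violation as the iterates approach a point satisfying both constraints. This mirrors, only qualitatively, the recurring-task setting for which the bounds are developed.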