LG · CV · May 6

Balancing Stability and Plasticity in Sequentially Trained Early-Exiting Neural Networks

arXiv:2605.05358 · 37.2 · h-index: 32
Predicted impact: top 64% in LG · last 90 days · Originality: Synthesis-oriented
AI Analysis

For practitioners using early-exit networks, this paper addresses the stability-plasticity trade-off so that earlier exits keep their performance during sequential training.

Early-exiting neural networks suffer from interference when exits are trained sequentially, degrading earlier classifiers. The proposed methods, based on Elastic Weight Consolidation and Learning without Forgetting, improve early-exit accuracy and achieve speedups at low computational budgets.

Early-exiting neural networks enable adaptive inference by allowing inputs to exit at intermediate classifiers, reducing computation for easy samples while maintaining high accuracy. In practice, exits can be trained sequentially by incrementally adding them to a shared backbone; however, this sequential training can cause newly introduced exits to interfere with previously learned ones, degrading the performance of earlier classifiers. We address this problem by retaining the knowledge embedded in existing exits while allowing new ones to specialize. We propose two alternative approaches that operate at different levels of the model. The first constrains learning by protecting parameters that are important for previously trained exits, while the second preserves the output distributions of earlier exits as the network adapts. These alternatives directly reflect the stability-plasticity trade-off studied in continual learning. Accordingly, we leverage Elastic Weight Consolidation to constrain critical weights and Learning without Forgetting to preserve output distributions. Experiments on standard benchmarks show that our approaches consistently improve early-exit performance, achieving higher accuracy than existing sequential training methods and significant speedups at low computational budgets.
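The two regularizers named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact loss formulation, so the code below assumes the commonly used forms: a quadratic penalty weighted by a diagonal Fisher estimate for Elastic Weight Consolidation, and a KL divergence between temperature-softened outputs for Learning without Forgetting. The function names, the flat parameter lists, and the defaults `lam=1.0` and `temperature=2.0` are hypothetical choices for illustration, not values from the paper.

```python
import math


def ewc_penalty(params, anchor_params, fisher, lam=1.0):
    """Parameter-level stability term (assumed standard EWC form).

    Computes (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2, where
    anchor_params are the weights saved after training the earlier exits
    and fisher holds a diagonal importance estimate for each weight.
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def lwf_distillation(old_logits, new_logits, temperature=2.0):
    """Output-level stability term (assumed LwF-style distillation).

    Computes KL(p_old || p_new) between the softened distribution an
    earlier exit produced before the update (old_logits) and the one it
    produces now (new_logits).
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return sum(
        po * (math.log(po) - math.log(pn))
        for po, pn in zip(p_old, p_new)
        if po > 0.0
    )


# Hypothetical usage: add one of the two terms to the loss of the new exit.
new_exit_loss = 0.7  # placeholder value standing in for a task loss
total_with_ewc = new_exit_loss + ewc_penalty(
    [1.0, 3.0], [0.0, 1.0], [2.0, 0.5], lam=2.0
)
total_with_lwf = new_exit_loss + lwf_distillation(
    [2.0, 0.5, 0.1], [1.5, 0.9, 0.2]
)
```

In this sketch the EWC term grows when weights with a large importance estimate drift from their saved values, while the LwF term grows when an earlier exit's predictions change, regardless of which weights moved. That difference is one way to read the "different levels of the model" distinction drawn in the abstract.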

