LG, AI | May 13, 2025

Low-Complexity Inference in Continual Learning via Compressed Knowledge Transfer

Zhenrong Liu, Janne M. J. Huttunen, Mikko Honkala

arXiv:2505.08327v17.11 citations | h-index: 18

Originality: Incremental advance

AI Analysis

This work aims to make continual learning more practical for real-world applications that require low latency or energy efficiency. The contribution is incremental: it applies existing compression techniques (pruning and knowledge distillation) to the class-incremental learning setting.

The paper tackles the high computational cost of large pre-trained models in continual learning by proposing pruning and knowledge distillation frameworks for class-incremental learning, achieving a better trade-off between accuracy and inference complexity on multiple benchmarks.

Continual learning (CL) aims to train models that can learn a sequence of tasks without forgetting previously acquired knowledge. A core challenge in CL is balancing stability -- preserving performance on old tasks -- and plasticity -- adapting to new ones. Recently, large pre-trained models have been widely adopted in CL for their ability to support both, offering strong generalization for new tasks and resilience against forgetting. However, their high computational cost at inference time limits their practicality in real-world applications, especially those requiring low latency or energy efficiency. To address this issue, we explore model compression techniques, including pruning and knowledge distillation (KD), and propose two efficient frameworks tailored for class-incremental learning (CIL), a challenging CL setting where task identities are unavailable during inference. The pruning-based framework includes pre- and post-pruning strategies that apply compression at different training stages. The KD-based framework adopts a teacher-student architecture, where a large pre-trained teacher transfers downstream-relevant knowledge to a compact student. Extensive experiments on multiple CIL benchmarks demonstrate that the proposed frameworks achieve a better trade-off between accuracy and inference complexity, consistently outperforming strong baselines. We further analyze the trade-offs between the two frameworks in terms of accuracy and efficiency, offering insights into their use across different scenarios.
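The abstract does not spell out the exact pruning criterion or distillation objective, so the following is only a minimal sketch of the two generic building blocks it names, under stated assumptions: a temperature-softened KL distillation loss between teacher and student logits, and unstructured magnitude pruning of a flat weight list. The function names (`distillation_loss`, `magnitude_prune`), the temperature value, and the choice of magnitude as the pruning criterion are illustrative assumptions, not the authors' method.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: a generic soft-target distillation loss; the paper's
    actual objective may differ.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    Assumption: unstructured magnitude pruning, used here only as a
    stand-in for whatever criterion the paper applies.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to remove
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 2.0, 1.0]
    print("KD loss:", distillation_loss(student, teacher))
    print("pruned:", magnitude_prune([0.5, -0.1, 2.0, 0.05], sparsity=0.5))
```

In a pre-pruning setup of the kind the abstract describes, a step like `magnitude_prune` would run before downstream training, and in a post-pruning setup after it; in the KD setup, a loss like `distillation_loss` would typically be combined with the usual classification loss on the student.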
