LGNov 12, 2025
PDAC: Efficient Coreset Selection for Continual Learning via Probability Density AwarenessJunqi Gao, Zhichang Guo, Dazhi Zhang et al.
Rehearsal-based Continual Learning (CL) maintains a limited memory buffer to store replay samples for knowledge retention, making these approaches heavily reliant on the quality of the stored samples. Current Rehearsal-based CL methods typically construct the memory buffer by selecting a representative subset (referred to as coresets), aiming to approximate the training efficacy of the full dataset with minimal storage overhead. However, mainstream Coreset Selection (CS) methods generally formulate the CS problem as a bi-level optimization problem that relies on numerous inner and outer iterations to solve, leading to substantial computational cost thus limiting their practical efficiency. In this paper, we aim to provide a more efficient selection logic and scheme for coreset construction. To this end, we first analyze the Mean Squared Error (MSE) between the buffer-trained model and the Bayes-optimal model through the perspective of localized error decomposition to investigate the contribution of samples from different regions to MSE suppression. Further theoretical and experimental analyses demonstrate that samples with high probability density play a dominant role in error suppression. Inspired by this, we propose the Probability Density-Aware Coreset (PDAC) method. PDAC leverages the Projected Gaussian Mixture (PGM) model to estimate each sample's joint density, enabling efficient density-prioritized buffer selection. Finally, we introduce the streaming Expectation Maximization (EM) algorithm to enhance the adaptability of PGM parameters to streaming data, yielding Streaming PDAC (SPDAC) for streaming scenarios. Extensive comparative experiments show that our methods outperforms other baselines across various CL settings while ensuring favorable efficiency.
LGApr 30
Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector CompressionJunqi Gao, Dazhi Zhang, Zhichang Guo et al.
Model merging has attracted attention as an effective path toward multi-task adaptation by integrating knowledge from multiple task-specific models. Among existing approaches, dynamic merging mitigates performance degradation caused by conflicting parameter updates across tasks by flexibly combining task-specific parameters at inference time, thereby maintaining high performance. However, these methods require storing independent parameters for each task, resulting in prohibitive storage overhead. To address this issue, we first experimentally demonstrate that the fine-tuned weight increments (referred to as task vectors) exhibit an impulse-like activation pattern and high robustness to low-bit representations. Driven by this insight, we propose T-Switch, which decomposes task vectors into three compact components: a binary sparse mask, a sign vector, and a scalar scaling factor, achieving high-fidelity approximation at high compression ratios. We then introduce Auto-Switch, a training-free merging scheme that automatically composes task vectors via feature similarity retrieval. Building on this, we develop Auto-Switch, a training-free merging scheme that automatically assembles task vectors through feature similarity retrieval. Furthermore, to transform task vector sparsification and quantization from static rules to adaptive learning, we propose FlexSwitch, a learnable framework which jointly optimizes the compression strategy for each model unit via Learnable Gating Sparsification (LGS) and Bit-width Adaptive Selection (BAS), while employing the Sparsity-Aware Storage Strategy (SASS) to select the optimal storage encoding structure. Finally, by incorporating a K-Nearest Neighbor (KNN) inference scheme with a learnable low-rank metric, we present Auto-FlexSwitch, a dynamic model merging approach that supports highly efficient task vector compression.
CVNov 24, 2024
A Tunable Despeckling Neural Network Stabilized via Diffusion EquationYi Ran, Zhichang Guo, Jia Li et al.
The removal of multiplicative Gamma noise is a critical research area in the application of synthetic aperture radar (SAR) imaging, where neural networks serve as a potent tool. However, real-world data often diverges from theoretical models, exhibiting various disturbances, which makes the neural network less effective. Adversarial attacks can be used as a criterion for judging the adaptability of neural networks to real data, since adversarial attacks can find the most extreme perturbations that make neural networks ineffective. In this work, the diffusion equation is designed as a regularization block to provide sufficient regularity to the whole neural network, due to its spontaneous dissipative nature. We propose a tunable, regularized neural network framework that unrolls a shallow denoising neural network block and a diffusion regularity block into a single network for end-to-end training. The linear heat equation, known for its inherent smoothness and low-pass filtering properties, is adopted as the diffusion regularization block. In our model, a single time step hyperparameter governs the smoothness of the outputs and can be adjusted dynamically, significantly enhancing flexibility. The stability and convergence of our model are theoretically proven. Experimental results demonstrate that the proposed model effectively eliminates high-frequency oscillations induced by adversarial attacks. Finally, the proposed model is benchmarked against several state-of-the-art denoising methods on simulated images, adversarial samples, and real SAR images, achieving superior performance in both quantitative and visual evaluations.