LG | AI | Mar 3

Joint Training Across Multiple Activation Sparsity Regimes

arXiv:2603.03131v1 | h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses generalization in deep learning, but the advance is incremental: it builds on existing activation-sparsity and training methods.

The paper aims to improve generalization in deep neural networks by training a single model to maintain effective representations across both dense and sparse activation regimes. In single-run experiments on CIFAR-10 without data augmentation, two adaptive keep-ratio control strategies outperformed a dense baseline.

Generalization in deep neural networks remains only partially understood. Inspired by the stronger generalization tendency of biological systems, we explore the hypothesis that robust internal representations should remain effective across both dense and sparse activation regimes. To test this idea, we introduce a simple training strategy that applies global top-k constraints to hidden activations and repeatedly cycles a single model through multiple activation budgets via progressive compression and periodic reset. Using CIFAR-10 without data augmentation and a WRN-28-4 backbone, we find in single-run experiments that two adaptive keep-ratio control strategies both outperform dense baseline training. These preliminary results suggest that joint training across multiple activation sparsity regimes may provide a simple and effective route to improved generalization.
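The two mechanisms named in the abstract, a top-k constraint on hidden activations and a keep ratio that is progressively compressed and periodically reset, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `top_k_mask` and `cyclic_keep_ratios`, the per-sample selection by largest value, and the fixed halving schedule with a floor of 0.25 are all assumptions made for the example. In particular, the paper reports adaptive keep-ratio control strategies, which this fixed schedule does not reproduce.

```python
from typing import List


def top_k_mask(activations: List[float], keep_ratio: float) -> List[float]:
    """Keep the largest `keep_ratio` fraction of activations and zero the rest.

    Assumption: selection is by largest value over one sample's flattened
    hidden activations; the paper may define "global" differently.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    # Always keep at least one unit so the signal is never fully erased.
    k = max(1, int(keep_ratio * len(activations)))
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(activations)]


def cyclic_keep_ratios(num_stages: int, decay: float = 0.5,
                       floor: float = 0.25) -> List[float]:
    """Hypothetical fixed schedule: compress the activation budget by `decay`
    each stage, and reset to dense (1.0) once it would drop below `floor`."""
    ratios = []
    ratio = 1.0
    for _ in range(num_stages):
        ratios.append(ratio)
        ratio *= decay          # progressive compression
        if ratio < floor:
            ratio = 1.0         # periodic reset to the dense regime
    return ratios


if __name__ == "__main__":
    hidden = [0.1, 0.9, 0.4, 0.7]
    for r in cyclic_keep_ratios(5):
        print(r, top_k_mask(hidden, r))
```

In a real training loop the masking would be applied to hidden-layer tensors inside the forward pass, with the keep ratio updated between training stages, so that the same weights are trained under every activation budget in the cycle.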

