An Adaptive Method Stabilizing Activations for Enhanced Generalization
This work addresses the challenge of enhancing generalization for image classification tasks, but it appears incremental as it builds on existing optimization methods.
The paper tackles the problem of improving generalization in neural networks by introducing AdaAct, an optimization algorithm that adjusts learning rates based on activation variance to stabilize neuron outputs. The method achieves competitive performance on CIFAR and ImageNet benchmarks and bridges the gap between Adam's convergence speed and SGD's generalization.
We introduce AdaAct, a novel optimization algorithm that adjusts learning rates according to activation variance. Our method enhances the stability of neuron outputs by incorporating neuron-wise adaptivity during the training process, which in turn leads to better generalization, offering a complementary approach to conventional activation regularization methods. We evaluate AdaAct on CIFAR and ImageNet, comparing it with other state-of-the-art methods, and the experimental results demonstrate competitive performance across these standard image classification benchmarks. Importantly, AdaAct bridges the gap between the convergence speed of Adam and the strong generalization capabilities of SGD, all while maintaining competitive execution times. Code is available at https://github.com/hseung88/adaact.
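
To make the idea of neuron-wise adaptivity concrete, the following is a minimal sketch of a gradient step whose per-neuron step size shrinks as the variance of that neuron's activations grows. It is an illustration under stated assumptions, not the AdaAct update rule itself: the abstract does not give the exact formula, so the exponential moving average, the inverse-square-root scaling, and the names `activation_variance`, `adaptive_step`, `beta`, and `eps` are all hypothetical choices made here. The authors' actual implementation is in the repository linked above.

```python
from math import sqrt


def activation_variance(batch):
    """Per-neuron (column-wise) variance of a batch of activations."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    return [sum((a - m) ** 2 for a in col) / n for col, m in zip(columns, means)]


def adaptive_step(weights, grads, batch, running_var, lr=0.1, beta=0.9, eps=1e-8):
    """One illustrative update for a linear layer (assumed form, not AdaAct's).

    weights, grads: lists of rows, shape (out_features, in_features)
    batch: input activations to the layer, shape (batch_size, in_features)
    running_var: running variance estimate per input neuron
    """
    # Assumption: track activation variance with an exponential moving average.
    batch_var = activation_variance(batch)
    running_var = [beta * v + (1 - beta) * b for v, b in zip(running_var, batch_var)]

    # Assumption: divide each input neuron's gradient column by sqrt(variance),
    # so neurons with more volatile activations take smaller steps.
    new_weights = [
        [w - lr * g / (sqrt(v) + eps) for w, g, v in zip(w_row, g_row, running_var)]
        for w_row, g_row in zip(weights, grads)
    ]
    return new_weights, running_var


# Two input neurons: the first has much larger activation variance than the second.
batch = [[0.0, 0.0], [10.0, 1.0]]
weights = [[0.0, 0.0]]
grads = [[1.0, 1.0]]
new_weights, running_var = adaptive_step(weights, grads, batch, [0.0, 0.0])
```

With identical gradients, the weight attached to the high-variance neuron moves less than the weight attached to the low-variance neuron, which is the stabilizing effect the abstract describes in qualitative terms.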