VISP: Volatility Informed Stochastic Projection for Adaptive Regularization
VISP addresses overfitting for deep learning practitioners, offering an incremental improvement over existing regularization techniques.
The paper tackles overfitting in deep neural networks by proposing VISP, an adaptive regularization method that uses gradient volatility to guide stochastic noise injection. The method improves generalization performance on MNIST, CIFAR-10, and SVHN over baseline models and fixed-noise alternatives.
We propose VISP: Volatility Informed Stochastic Projection, an adaptive regularization method that leverages gradient volatility to guide stochastic noise injection in deep neural networks. Unlike conventional techniques that apply uniform noise or fixed dropout rates, VISP dynamically computes volatility from gradient statistics and uses it to scale a stochastic projection matrix. This mechanism selectively regularizes inputs and hidden nodes that exhibit higher gradient volatility while preserving stable representations, thereby mitigating overfitting. Extensive experiments on MNIST, CIFAR-10, and SVHN demonstrate that VISP consistently improves generalization performance over baseline models and fixed-noise alternatives. In addition, detailed analyses of the evolution of volatility, the spectral properties of the projection matrix, and activation distributions reveal that VISP not only stabilizes the internal dynamics of the network but also fosters a more robust feature representation.
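The abstract does not give the exact formulas, so the following is only a minimal sketch of how volatility-scaled stochastic projection could look, not the authors' implementation. It assumes that volatility is a bias-corrected exponential moving standard deviation of per-feature gradients, and that the projection matrix takes the form P = I + alpha * N * diag(v), where N is Gaussian noise and v is the volatility normalized to [0, 1]. The names `VolatilityTracker`, `stochastic_projection`, `visp_forward`, and the parameters `beta` and `alpha` are hypothetical.

```python
import numpy as np


class VolatilityTracker:
    """Running per-feature gradient volatility (assumed: bias-corrected EMA std)."""

    def __init__(self, dim, beta=0.9):
        self.beta = beta              # hypothetical smoothing factor
        self.mean = np.zeros(dim)     # EMA of gradients
        self.sq_mean = np.zeros(dim)  # EMA of squared gradients
        self.steps = 0

    def update(self, grad):
        """Fold in one gradient vector of shape (dim,) and return the volatility."""
        self.steps += 1
        self.mean = self.beta * self.mean + (1 - self.beta) * grad
        self.sq_mean = self.beta * self.sq_mean + (1 - self.beta) * grad ** 2
        correction = 1 - self.beta ** self.steps  # removes the zero-init bias
        mean_hat = self.mean / correction
        sq_hat = self.sq_mean / correction
        # Clip at zero to guard against round-off before the square root.
        return np.sqrt(np.maximum(sq_hat - mean_hat ** 2, 0.0))


def stochastic_projection(volatility, alpha, rng):
    """Build P = I + alpha * N * diag(v), with v the volatility scaled to [0, 1]."""
    dim = volatility.shape[0]
    v = volatility / (volatility.max() + 1e-12)
    noise = rng.standard_normal((dim, dim)) / np.sqrt(dim)
    # Column j of the noise is scaled by v[j], so output feature j is
    # perturbed in proportion to its own volatility.
    return np.eye(dim) + alpha * noise * v[None, :]


def visp_forward(x, volatility, alpha, rng, training=True):
    """Apply the projection to a batch x of shape (batch, dim) during training only."""
    if not training:
        return x
    return x @ stochastic_projection(volatility, alpha, rng)


rng = np.random.default_rng(0)
tracker = VolatilityTracker(dim=3)
for _ in range(50):
    # Toy gradients: feature 0 has zero gradient, feature 2 fluctuates most.
    grad = rng.standard_normal(3) * np.array([0.0, 0.1, 1.0])
    vol = tracker.update(grad)

x = rng.standard_normal((4, 3))
y = visp_forward(x, vol, alpha=0.5, rng=rng)
```

In this sketch a feature whose gradient never changes has zero volatility, so its column of P reduces to the identity and the feature passes through unperturbed, while features with volatile gradients receive proportionally more noise. Scaling columns rather than rows is a design choice made here for illustration; the paper may scale the matrix differently.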