Adaptive Norm-Based Regularization for Neural Networks
For practitioners training neural networks on high-dimensional or correlated data, this work offers regularization techniques that, in the paper's simulations and two real-data applications, outperform standard weight decay and lasso penalties.
The paper introduces two norm-based regularization strategies for neural networks: a covariance-aware ridge penalty, and an $\ell_1$ sparsity penalty combined with a covariance-aware $\ell_2$ penalty. Both improve predictive performance and complexity control over standard norm-based penalties, especially with correlated or high-dimensional features.
In this paper, we study norm-based regularization methods for neural networks. We compare existing penalization approaches and introduce two regularization strategies that extend classical ridge- and lasso-type penalties to neural network models. The first strategy modifies weight decay by incorporating the covariance structure of the input features into a ridge-type $\ell_2$ penalty, allowing regularization to account for feature dependence. The second combines an $\ell_1$ sparsity penalty with covariance-aware $\ell_2$ regularization, producing neural network weights that are both sparse and structurally informed. Monte Carlo simulations are used to evaluate these methods under different data-generating settings, followed by two real-data applications on building cooling-load prediction and leukemia cell-type classification from high-dimensional gene expression data. Across simulated and real-data examples, the proposed regularizers improve predictive performance on unseen data and provide more effective complexity control than standard norm-based penalties, particularly when features are correlated or high-dimensional.
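To make the two penalties concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not give the exact functional forms, so the sketch assumes the covariance-aware ridge term is the quadratic form $\sum_k w_k^\top \hat{\Sigma} w_k$ over the first-layer weight vectors $w_k$, with $\hat{\Sigma}$ the sample covariance of the inputs, and that the combined penalty is $\lambda_1 \|W\|_1$ plus $\lambda_2$ times that term. All function names and the parameters `lam1` and `lam2` are hypothetical.

```python
# Illustrative sketch only: the penalty forms below are assumptions,
# not the definitions used in the paper.

def sample_covariance(X):
    """Unbiased sample covariance of the columns of X (a list of rows)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in X) / (n - 1)
            for b in range(p)
        ]
        for a in range(p)
    ]

def covariance_ridge_penalty(W, cov):
    """Assumed covariance-aware ridge term: sum over hidden units of w^T cov w.

    W holds one weight vector per hidden unit (one row per unit).
    With cov equal to the identity this is ordinary squared-norm weight decay.
    """
    total = 0.0
    for w in W:
        for a, wa in enumerate(w):
            for b, wb in enumerate(w):
                total += wa * cov[a][b] * wb
    return total

def l1_penalty(W):
    """Lasso-type term: sum of absolute values of all weights."""
    return sum(abs(v) for w in W for v in w)

def combined_penalty(W, cov, lam1, lam2):
    """Assumed combined form: lam1 * l1 term + lam2 * covariance-aware l2 term."""
    return lam1 * l1_penalty(W) + lam2 * covariance_ridge_penalty(W, cov)

# Toy data with two perfectly correlated features (x2 = 2 * x1).
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = sample_covariance(X)
identity = [[1.0, 0.0], [0.0, 1.0]]

W_aligned = [[1.0, 1.0]]    # weights along the shared direction of variation
W_opposed = [[2.0, -1.0]]   # weights that cancel the correlated features

aligned_cov = covariance_ridge_penalty(W_aligned, cov)
opposed_cov = covariance_ridge_penalty(W_opposed, cov)
opposed_plain = covariance_ridge_penalty(W_opposed, identity)
total = combined_penalty(W_aligned, cov, lam1=0.1, lam2=0.5)
```

Under these assumptions the penalty measures the variance of each hidden unit's pre-activation rather than the raw size of its weights. In the toy data, `W_opposed` has the larger squared norm but produces a constant pre-activation, so its covariance-aware term is zero, whereas plain weight decay would penalize it more heavily than `W_aligned`.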