ME · Mar 11
Partition-Based Functional Ridge Regression for High-Dimensional Data
Shaista Ashraf, Ismail Shah, Farrukh Javed
This paper proposes a partition-based functional ridge regression framework to address multicollinearity, overfitting, and interpretability in high-dimensional functional linear models. The coefficient function vector \( \boldsymbol{\beta}(s) \) is decomposed into two components, \( \boldsymbol{\beta}_1(s) \) and \( \boldsymbol{\beta}_2(s) \), representing dominant and weaker functional effects. This partition enables differential ridge penalization across functional blocks, so that important signals are preserved while less informative components are more strongly shrunk. The resulting approach improves numerical stability and enhances interpretability without relying on explicit variable selection. We develop three estimators: the Functional Ridge Estimator (FRE), the Functional Ridge Full Model (FRFM), and the Functional Ridge Sub-Model (FRSM). Under standard regularity conditions, we establish consistency and asymptotic normality for all estimators. Simulation results reveal a clear bias--variance trade-off: FRSM performs best in small samples through strong variance reduction, whereas FRFM achieves superior accuracy in moderate to large samples by retaining informative functional structure through adaptive penalization. An empirical application to Canadian weather data further demonstrates improved predictive performance, reduced variance inflation, and clearer identification of influential functional effects. Overall, partition-based ridge regularization provides a practical and theoretically grounded method for high-dimensional functional regression.
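The idea of differential ridge penalization across two coefficient blocks can be illustrated in a finite-dimensional setting. The sketch below is not the paper's FRE, FRFM, or FRSM estimator: it assumes the functional covariates have already been reduced to an ordinary design matrix (for example, by a basis expansion), and the function name `partitioned_ridge`, the block sizes, and the penalty values are all hypothetical choices made for illustration.

```python
import numpy as np

def partitioned_ridge(X, y, p1, lam1, lam2):
    """Generalized ridge: penalty lam1 on the first p1 columns, lam2 on the rest.

    Solves (X'X + D) beta = X'y with D = diag(lam1, ..., lam1, lam2, ..., lam2).
    This is an illustrative finite-dimensional analogue, not the paper's estimator.
    """
    p = X.shape[1]
    penalties = np.concatenate([np.full(p1, lam1), np.full(p - p1, lam2)])
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)

# Synthetic data (assumed): 3 dominant effects followed by 5 weak effects.
rng = np.random.default_rng(0)
n, p1, p2 = 200, 3, 5
X = rng.normal(size=(n, p1 + p2))
beta_true = np.concatenate([np.array([2.0, -1.5, 1.0]), np.full(p2, 0.05)])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Same light penalty on both blocks versus a much heavier penalty on the weak block.
beta_light = partitioned_ridge(X, y, p1, lam1=0.1, lam2=0.1)
beta_split = partitioned_ridge(X, y, p1, lam1=0.1, lam2=1e4)

print("dominant block (split penalty):", np.round(beta_split[:p1], 3))
print("weak block     (split penalty):", np.round(beta_split[p1:], 4))
```

With a heavy penalty on the second block, its coefficients are shrunk toward zero while the first block is left close to its lightly penalized fit, which mirrors the "preserve strong signals, shrink weak ones" behavior described in the abstract.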
LG · Mar 11
Beyond Accuracy: Reliability and Uncertainty Estimation in Convolutional Neural Networks
Sanne Ruijs, Alina Kosiakova, Farrukh Javed
Deep neural networks (DNNs) have become integral to a wide range of scientific and practical applications due to their flexibility and strong predictive performance. Despite their accuracy, however, DNNs frequently exhibit poor calibration, often assigning overly confident probabilities to incorrect predictions. This limitation underscores the growing need for integrated mechanisms that provide reliable uncertainty estimation. In this article, we compare two prominent approaches for uncertainty quantification: a Bayesian approximation via Monte Carlo Dropout and the nonparametric Conformal Prediction framework. Both methods are assessed using two convolutional neural network architectures, H-CNN VGG16 and GoogLeNet, trained on the Fashion-MNIST dataset. The empirical results show that although H-CNN VGG16 attains higher predictive accuracy, it tends to exhibit pronounced overconfidence, whereas GoogLeNet yields better-calibrated uncertainty estimates. Conformal Prediction additionally demonstrates consistent validity by producing statistically guaranteed prediction sets, highlighting its practical value in high-stakes decision-making contexts. Overall, the findings emphasize the importance of evaluating model performance beyond accuracy alone and contribute to the development of more reliable and trustworthy deep learning systems.
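To make the notion of "statistically guaranteed prediction sets" concrete, here is a minimal sketch of split conformal prediction for classification. It is not the authors' pipeline: the class probabilities are simulated rather than produced by H-CNN VGG16 or GoogLeNet, the nonconformity score (one minus the probability of the true class) is one common choice among several, and all function names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points: fall back to including every class
        return np.inf
    return np.sort(scores)[k - 1]

def prediction_sets(probs, qhat):
    """Boolean mask of shape (n, K): True where class k is in the prediction set."""
    return (1.0 - probs) <= qhat

# Simulated classifier outputs (assumed): 10 classes, true class gets a logit boost.
rng = np.random.default_rng(0)

def simulate(n, K=10):
    labels = rng.integers(0, K, size=n)
    logits = rng.normal(size=(n, K))
    logits[np.arange(n), labels] += 2.0
    return softmax(logits), labels

cal_probs, cal_labels = simulate(1000)
test_probs, test_labels = simulate(1000)

alpha = 0.1
qhat = conformal_threshold(cal_probs, cal_labels, alpha)
sets = prediction_sets(test_probs, qhat)
coverage = sets[np.arange(len(test_labels)), test_labels].mean()

print("threshold:", round(float(qhat), 3))
print("empirical coverage:", round(float(coverage), 3))
print("average set size:", round(float(sets.sum(axis=1).mean()), 2))
```

Under exchangeability of calibration and test data, the sets contain the true label with probability at least 1 - alpha on average, regardless of how well calibrated the underlying network is; a poorly calibrated model typically pays for this with larger sets.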
ML · Apr 30
Adaptive Norm-Based Regularization for Neural Networks
Muhammad Qasim, Farrukh Javed
In this paper, we study norm-based regularization methods for neural networks. We compare existing penalization approaches and introduce two regularization strategies that extend classical ridge- and lasso-type penalties to neural network models. The first strategy modifies weight decay by incorporating the covariance structure of the input features into a ridge-type $\ell_2$ penalty, allowing regularization to account for feature dependence. The second combines an $\ell_1$ sparsity penalty with covariance-aware $\ell_2$ regularization, producing neural network weights that are both sparse and structurally informed. Monte Carlo simulations are used to evaluate these methods under different data-generating settings, followed by two real-data applications on building cooling-load prediction and leukemia cell-type classification from high-dimensional gene expression data. Across simulated and real-data examples, the proposed regularizers improve predictive performance on unseen data and provide more effective complexity control than standard norm-based penalties, particularly when features are correlated or high-dimensional.
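The abstract does not give the exact form of the two penalties, so the following is only one plausible reading, stated as an assumption: for a first-layer weight matrix \(W\) (inputs by hidden units) and the sample covariance \(\hat{\Sigma}\) of the inputs, a covariance-aware ridge term could be \(\lambda_2 \,\mathrm{tr}(W^\top \hat{\Sigma} W)\), and the combined variant could add \(\lambda_1 \lVert W \rVert_1\). The function names and the toy data below are hypothetical.

```python
import numpy as np

def covariance_ridge_penalty(W, Sigma, lam2):
    """Assumed form: lam2 * trace(W' Sigma W). Reduces to lam2 * ||W||_F^2 when Sigma = I."""
    return lam2 * np.trace(W.T @ Sigma @ W)

def sparse_covariance_penalty(W, Sigma, lam1, lam2):
    """Assumed form: l1 sparsity term plus the covariance-aware ridge term."""
    return lam1 * np.abs(W).sum() + covariance_ridge_penalty(W, Sigma, lam2)

# Toy inputs with two strongly correlated features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)
Sigma = np.cov(X, rowvar=False)

W = rng.normal(size=(4, 3))  # first-layer weights: 4 inputs, 3 hidden units

ridge_plain = covariance_ridge_penalty(W, np.eye(4), lam2=0.5)  # standard weight decay
ridge_cov = covariance_ridge_penalty(W, Sigma, lam2=0.5)
combined = sparse_covariance_penalty(W, Sigma, lam1=0.1, lam2=0.5)

print("plain weight decay:", round(float(ridge_plain), 3))
print("covariance-aware ridge:", round(float(ridge_cov), 3))
print("l1 + covariance-aware ridge:", round(float(combined), 3))
```

With the identity in place of \(\hat{\Sigma}\), the first penalty is ordinary weight decay, which is the sense in which such a term would extend the classical ridge penalty; the paper's actual definitions may differ.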