Symmetry-Aware Convex Shrinkage for High-Dimensional Covariance Estimation

arXiv:2605.1711135.11 citations

AI Analysis

For practitioners in finance, climate, genomics, and other fields requiring robust covariance estimation, this work provides a principled way to incorporate structural priors via symmetry groups, improving upon the widely used Ledoit-Wolf estimator when such structure exists.

The paper develops a class of data-adaptive convex shrinkage estimators for high-dimensional covariance estimation. Each estimator selects a symmetry group from a candidate library and uses the Reynolds projection of the sample covariance under that group as the shrinkage target. The method generalizes Ledoit-Wolf shrinkage and the group-symmetric MLE, and is shown to dominate Ledoit-Wolf in Frobenius MSE under a sufficient-match condition, with empirical illustration on six real-world datasets.

We develop a class of data-adaptive shrinkage estimators for high-dimensional covariance estimation in which the shrinkage target is a Reynolds projection of the sample covariance under a finite symmetry group selected from a candidate library by held-out predictive performance. The class generalizes the convex shrinkage estimator of Ledoit and Wolf by replacing the scalar-identity target with a structured target derived from a symmetry group when one is available, and generalizes the group-symmetric maximum-likelihood estimator of Shah and Chandrasekaran by combining structural targeting with adaptive convex shrinkage and by selecting the group from data rather than treating it as prespecified. A two-tier procedure performs the group selection: a universal per-candidate evaluation based on held-out negative log-likelihood, optionally preceded by a domain-specific step that constructs the candidate library from structural priors. We establish a finite-sample regret bound for the held-out calibration of the convex combination weight, an oracle inequality for the data-driven group selection, and a quantitative sufficient-match condition under which the proposed estimator dominates Ledoit-Wolf shrinkage in Frobenius mean-squared error. The procedure is illustrated on six real-data problems spanning finance (S&P 500 daily returns), climate (NOAA OISST sea-surface temperature anomalies), genomics (TCGA-BRCA gene expression), radio signal processing (RadioML 2018.A), astronomical imaging (Galaxy10 DECaLS), and natural image patches (CIFAR-10 with a CIFAR-10.1 distribution-shift companion). An empirical comparison is also made against the Bayesian permutation-symmetry estimator of Chojecki and colleagues. Outside the few-shot regime, where structural priors carry the most information per observation, Ledoit-Wolf shrinkage remains the appropriate baseline.
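The estimator described above can be pictured as a convex combination (1 - alpha) * S + alpha * P_G(S), where S is the sample covariance and P_G(S) is the Reynolds projection, i.e. the average of g S g^T over the elements g of the group G. The following is a minimal sketch of that idea, not the paper's implementation. It rests on several assumptions that do not come from the abstract: groups are restricted to permutation groups given as lists of index arrays, data are treated as zero-mean Gaussian, the weight alpha is calibrated by a plain grid search on a single held-out split, and all function names (`reynolds_projection`, `cyclic_group`, `heldout_nll`, `select_group_and_shrink`) are hypothetical.

```python
import numpy as np


def reynolds_projection(S, perms):
    """Average P S P^T over a permutation group.

    `perms` must list every element of the group as an index array p,
    for which (P S P^T)[i, j] = S[p[i], p[j]].
    """
    return np.mean([S[np.ix_(p, p)] for p in perms], axis=0)


def cyclic_group(d):
    """All d cyclic shifts of the coordinates (illustrative candidate group)."""
    idx = np.arange(d)
    return [np.roll(idx, k) for k in range(d)]


def heldout_nll(Sigma, X_val):
    """Average zero-mean Gaussian negative log-likelihood, up to a constant.

    Returns +inf when Sigma is not numerically positive definite.
    """
    w = np.linalg.eigvalsh(Sigma)
    if w.min() <= 1e-10 * w.max():
        return np.inf
    S_val = X_val.T @ X_val / X_val.shape[0]
    return 0.5 * (np.sum(np.log(w)) + np.trace(np.linalg.solve(Sigma, S_val)))


def select_group_and_shrink(X_train, X_val, library, alphas=np.linspace(0.0, 1.0, 21)):
    """Pick the (group, alpha) pair with the lowest held-out NLL.

    `library` maps a candidate name to its list of permutations.
    The grid search over `alphas` is an assumption of this sketch.
    """
    S = X_train.T @ X_train / X_train.shape[0]  # sample covariance, zero mean assumed
    best = (np.inf, None, None, None)
    for name, perms in library.items():
        T = reynolds_projection(S, perms)  # structured shrinkage target
        for a in alphas:
            Sigma = (1.0 - a) * S + a * T  # convex shrinkage toward the target
            score = heldout_nll(Sigma, X_val)
            if score < best[0]:
                best = (score, name, a, Sigma)
    return best
```

Calling `select_group_and_shrink(X_train, X_val, {"trivial": [np.arange(d)], "cyclic": cyclic_group(d)})` returns the held-out score, the selected candidate name, the chosen weight, and the resulting covariance estimate. Under the trivial group the projection returns S itself, so that candidate reduces to the unshrunk sample covariance, which is singular whenever there are fewer training samples than dimensions.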
