LG · Apr 12

Exact Finite-Sample Variance Decomposition of Subagging: A Spectral Filtering Perspective

arXiv:2604.10469 · 23.0 · h-index: 4
AI Analysis

For ensemble learning practitioners, this paper clarifies how the subsampling ratio interacts with base-learner complexity, which helps calibrate subagging for better generalization.

The paper derives the first exact finite-sample variance decomposition for subagging, showing it acts as a low-pass spectral filter that attenuates high-order interaction variance by a geometric factor. A complexity-guided adaptive subsampling algorithm is proposed, which improves generalization over static baselines.

Standard resampling ratios (e.g., $\alpha \approx 0.632$) have been widely used as default baselines in ensemble learning for three decades. However, how these ratios interact with a base learner's intrinsic functional complexity in finite samples lacks an exact mathematical characterization. We leverage the Hoeffding-ANOVA decomposition to derive the first exact, finite-sample variance decomposition for subagging, applicable to any symmetric base learner without requiring asymptotic limits or smoothness assumptions. We establish that subagging operates as a deterministic low-pass spectral filter: it preserves low-order structural signals while attenuating $c$-th order interaction variance by a geometric factor approaching $\alpha^c$. This decoupling reveals why default baselines often under-regularize high-capacity interpolators, which instead require smaller $\alpha$ to exponentially suppress spurious high-order noise. To operationalize these insights, we propose a complexity-guided adaptive subsampling algorithm, empirically demonstrating that dynamically calibrating $\alpha$ to the learner's complexity spectrum consistently improves generalization over static baselines.
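To give a feel for the geometric attenuation described above, the following minimal sketch tabulates $\alpha^c$ next to one natural finite-sample counterpart: the probability that $c$ fixed points all land in a subsample of size $k = \alpha n$ drawn without replacement from $n$ points. Treating this inclusion probability as the relevant finite-sample factor is an assumption made for illustration, not a formula taken from the paper; the function name `inclusion_prob` and the values of `n`, `alpha`, and `c` are likewise hypothetical.

```python
from math import comb


def inclusion_prob(n: int, k: int, c: int) -> float:
    """Probability that c fixed points all fall in a size-k subsample
    drawn without replacement from n points (illustrative assumption,
    not the paper's stated factor)."""
    if c > k:
        return 0.0
    # Choose the remaining k - c points from the other n - c points.
    return comb(n - c, k - c) / comb(n, k)


n = 1000  # hypothetical training-set size
for alpha in (0.632, 0.3):
    k = round(alpha * n)  # subsample size
    for c in (1, 2, 4, 8):
        exact = inclusion_prob(n, k, c)
        limit = alpha ** c  # geometric factor from the abstract
        print(f"alpha={alpha:.3f} c={c}: finite-sample={exact:.5f} alpha^c={limit:.5f}")
```

Under this assumption, the table shows the qualitative point of the abstract: the factor for first-order terms stays near $\alpha$, while higher-order terms shrink much faster, and lowering $\alpha$ steepens the decay.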
