LG · ME · ML · Feb 12, 2025

Knowledge-Guided Wasserstein Distributionally Robust Optimization

arXiv:2502.08146v1 · 4 citations · h-index: 1 · ICML
Originality: Incremental advance
AI Analysis

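This work addresses statistical efficiency in transfer learning when target data are limited. It offers a method for reducing the pessimism of Wasserstein Distributionally Robust Optimization (WDRO). The advance is incremental, but it provides a novel interpretation of shrinkage-based transfer learning in terms of distributional robustness.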

The authors address the over-conservativeness of Wasserstein Distributionally Robust Optimization (WDRO) in transfer learning. Their proposed method, KG-WDRO, adaptively incorporates external knowledge to construct smaller ambiguity sets, which improves performance in small-sample settings in the reported simulations.
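
For reference, the "ambiguity set" mentioned above is the standard WDRO object. A generic statement of the problem is given below. The notation is chosen here for illustration and is not taken from the paper: $P_n$ is the empirical distribution of the target sample, $W_c$ a Wasserstein distance with transport cost $c$, $\delta$ the radius of the ambiguity set, and $\ell$ the loss.

```latex
\hat{\beta} \in \arg\min_{\beta} \;
  \sup_{P \,:\, W_c(P, P_n) \le \delta} \;
  \mathbb{E}_{P}\big[\ell(\beta; X, Y)\big]
```

A smaller ambiguity set means the supremum runs over fewer candidate distributions, so the worst-case loss is less pessimistic and the resulting estimate is shrunk less aggressively.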

Transfer learning is a popular strategy to leverage external knowledge and improve statistical efficiency, particularly with a limited target sample. We propose a novel knowledge-guided Wasserstein Distributionally Robust Optimization (KG-WDRO) framework that adaptively incorporates multiple sources of external knowledge to overcome the conservativeness of vanilla WDRO, which often results in overly pessimistic shrinkage toward zero. Our method constructs smaller Wasserstein ambiguity sets by controlling the transportation along directions informed by the source knowledge. This strategy can alleviate perturbations on the predictive projection of the covariates and protect against information loss. Theoretically, we establish the equivalence between our WDRO formulation and the knowledge-guided shrinkage estimation based on collinear similarity, ensuring tractability and geometrizing the feasible set. This also reveals a novel and general interpretation for recent shrinkage-based transfer learning approaches from the perspective of distributional robustness. In addition, our framework can adjust for scaling differences in the regression models between the source and target and accommodates general types of regularization such as lasso and ridge. Extensive simulations demonstrate the superior performance and adaptivity of KG-WDRO in enhancing small-sample transfer learning.
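
The idea of shrinking toward source knowledge up to a scale factor can be illustrated with a small numerical sketch. This is not the paper's estimator. It is a simplified ridge-type variant under stated assumptions: a squared loss, a squared Euclidean penalty, and a single source coefficient vector `theta`. The penalty is `min over kappa of ||beta - kappa * theta||^2`, which equals the squared norm of the component of `beta` orthogonal to `theta`, so the estimate is pulled toward the direction of `theta` rather than toward zero. The helper names `kg_ridge` and `ridge` and the simulated data are hypothetical.

```python
import numpy as np


def kg_ridge(X, y, theta, lam):
    """Ridge-type shrinkage toward the line spanned by theta (illustrative helper).

    Solves  min_beta (1/n) * ||y - X beta||^2 + lam * min_kappa ||beta - kappa * theta||^2.
    The inner minimum equals ||(I - P) beta||^2, where P projects onto span(theta),
    so the solution is available in closed form whenever X @ theta != 0.
    """
    n, p = X.shape
    P = np.outer(theta, theta) / (theta @ theta)  # projector onto span(theta)
    A = X.T @ X / n + lam * (np.eye(p) - P)
    return np.linalg.solve(A, X.T @ y / n)


def ridge(X, y, lam):
    """Plain ridge regression, which shrinks toward zero (baseline)."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)


# Simulated small-sample target problem (assumed setup, n < p).
rng = np.random.default_rng(0)
n, p = 15, 30
beta_true = rng.normal(size=p)
theta = 2.0 * beta_true  # source coefficients: same direction, different scale
X = rng.normal(size=(n, p))
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam = 1.0
err_kg = np.linalg.norm(kg_ridge(X, y, theta, lam) - beta_true)
err_ridge = np.linalg.norm(ridge(X, y, lam) - beta_true)
print(f"knowledge-guided error: {err_kg:.3f}")
print(f"plain ridge error:      {err_ridge:.3f}")
```

Because the penalty depends on `theta` only through its direction, rescaling the source coefficients leaves the estimate unchanged. This mirrors, in a simplified way, the abstract's point about adjusting for scaling differences between source and target models. The example uses an idealized source that is exactly collinear with the target coefficients, so the gap between the two errors is a best case, not a general guarantee.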

