LG, ME, ML · Feb 12, 2025

Knowledge-Guided Wasserstein Distributionally Robust Optimization

Zitao Wang, Ziyuan Wang, Molei Liu, Nian Si

arXiv:2502.08146v111.44 citations · h-index: 1 · ICML

Originality: Incremental advance

AI Analysis

This work addresses statistical efficiency in transfer learning with limited target data by offering a method that reduces the pessimism of WDRO. The advance is incremental, but it provides a novel interpretation of shrinkage-based transfer learning from the perspective of distributional robustness.

The authors address the over-conservativeness of Wasserstein Distributionally Robust Optimization (WDRO) in transfer learning by proposing KG-WDRO, which adaptively incorporates external knowledge to construct smaller ambiguity sets. In simulations, this improves performance in small-sample scenarios.

Transfer learning is a popular strategy to leverage external knowledge and improve statistical efficiency, particularly with a limited target sample. We propose a novel knowledge-guided Wasserstein Distributionally Robust Optimization (KG-WDRO) framework that adaptively incorporates multiple sources of external knowledge to overcome the conservativeness of vanilla WDRO, which often results in overly pessimistic shrinkage toward zero. Our method constructs smaller Wasserstein ambiguity sets by controlling the transportation along directions informed by the source knowledge. This strategy can alleviate perturbations on the predictive projection of the covariates and protect against information loss. Theoretically, we establish the equivalence between our WDRO formulation and the knowledge-guided shrinkage estimation based on collinear similarity, ensuring tractability and geometrizing the feasible set. This also reveals a novel and general interpretation for recent shrinkage-based transfer learning approaches from the perspective of distributional robustness. In addition, our framework can adjust for scaling differences in the regression models between the source and target and accommodates general types of regularization such as lasso and ridge. Extensive simulations demonstrate the superior performance and adaptivity of KG-WDRO in enhancing small-sample transfer learning.
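The idea of shrinking toward a source-informed direction rather than toward zero can be illustrated with a small sketch. This is not the authors' KG-WDRO estimator: it swaps in plain least squares with a squared ℓ2 penalty, min over β and κ of ||y − Xβ||² + λ||β − κθ||², where θ is a source coefficient vector and κ is a free scale factor standing in for the scaling adjustment the abstract mentions. The function names `kg_ridge` and `ridge`, the penalty weight `lam`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def kg_ridge(X, y, theta, lam):
    """Ridge-type estimate shrunk toward span{theta} instead of toward zero.

    Solves min_beta ||y - X beta||^2 + lam * min_kappa ||beta - kappa * theta||^2.
    Illustrative stand-in only; not the KG-WDRO objective from the paper.
    """
    theta = np.asarray(theta, dtype=float)
    p = X.shape[1]
    # Minimizing over kappa leaves only the part of beta orthogonal to theta,
    # so the penalty becomes beta^T (I - P) beta with P the projector onto theta.
    P = np.outer(theta, theta) / (theta @ theta)
    A = X.T @ X + lam * (np.eye(p) - P)
    return np.linalg.solve(A, X.T @ y)


def ridge(X, y, lam):
    """Ordinary ridge regression, which shrinks every direction toward zero."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


# Simulated small-sample target task (assumed setup, not from the paper):
# the target coefficients are a rescaled, slightly perturbed copy of the source.
rng = np.random.default_rng(0)
n, p = 15, 10
theta_src = rng.normal(size=p)
beta_true = 0.5 * theta_src + 0.05 * rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ beta_true + 0.5 * rng.normal(size=n)

for lam in (1.0, 10.0, 100.0):
    err_kg = np.linalg.norm(kg_ridge(X, y, theta_src, lam) - beta_true)
    err_ridge = np.linalg.norm(ridge(X, y, lam) - beta_true)
    print(f"lam={lam:6.1f}  kg error={err_kg:.3f}  ridge error={err_ridge:.3f}")
```

Because the penalty depends on θ only through its direction, rescaling the source coefficients leaves the estimate unchanged, and a very large `lam` forces the estimate to be collinear with θ rather than collapsing it to zero.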
