Transfer Learning Through Conditional Quantile Matching
This work addresses data scarcity in regression tasks, offering a flexible approach to transfer learning that does not rely on restrictive assumptions such as covariate or label shift.
The paper proposes a transfer learning framework that leverages heterogeneous source domains to improve predictive performance in a data-scarce target domain. In simulations and real data applications, it consistently improves accuracy over target-only learning and competing transfer learning methods.
We introduce a transfer learning framework for regression that leverages heterogeneous source domains to improve predictive performance in a data-scarce target domain. Our approach learns a conditional generative model separately for each source domain and calibrates the generated responses to the target domain via conditional quantile matching. This distributional alignment step corrects general discrepancies between source and target domains without imposing restrictive assumptions such as covariate or label shift. The resulting framework provides a principled and flexible approach to high-quality data augmentation for downstream learning tasks in the target domain. From a theoretical perspective, we show that an empirical risk minimizer (ERM) trained on the augmented dataset achieves a tighter excess risk bound than the target-only ERM under mild conditions. In particular, we establish new convergence rates for the quantile matching estimator that governs the transfer bias-variance tradeoff. From a practical perspective, extensive simulations and real data applications demonstrate that the proposed method consistently improves prediction accuracy over target-only learning and competing transfer learning methods.
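The calibration step can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes the conditional distributions are approximated by k-nearest-neighbour empirical distributions, and it treats the source pairs as if they had already been drawn from a fitted source generative model. The function name `conditional_quantile_match`, the parameter `k`, and the toy data are all hypothetical. Each source response y at covariate x is mapped to Q_T(F_S(y | x) | x), where F_S is a local source CDF and Q_T is a local target quantile function.

```python
import numpy as np

def conditional_quantile_match(x_src, y_src, x_tgt, y_tgt, k=50):
    """Map each source response onto the target scale via y -> Q_T(F_S(y | x) | x).

    Conditional distributions are approximated with k-nearest-neighbour
    empirical distributions (an illustrative assumption, not the paper's method).
    """
    y_src = np.asarray(y_src, dtype=float)
    y_tgt = np.asarray(y_tgt, dtype=float)
    x_src = np.asarray(x_src, dtype=float).reshape(len(y_src), -1)
    x_tgt = np.asarray(x_tgt, dtype=float).reshape(len(y_tgt), -1)
    k_s = min(k, len(y_src))
    k_t = min(k, len(y_tgt))

    y_cal = np.empty_like(y_src)
    for i in range(len(y_src)):
        # Responses of the nearest source and target neighbours of x_src[i].
        d_s = np.linalg.norm(x_src - x_src[i], axis=1)
        d_t = np.linalg.norm(x_tgt - x_src[i], axis=1)
        nb_s = y_src[np.argsort(d_s)[:k_s]]
        nb_t = y_tgt[np.argsort(d_t)[:k_t]]
        # Local source CDF level of y_src[i], kept strictly inside (0, 1).
        u = (np.sum(nb_s <= y_src[i]) - 0.5) / k_s
        # Read off the same level from the local target distribution.
        y_cal[i] = np.quantile(nb_t, u)
    return y_cal

# Toy example: the target differs from the source in both location and scale.
rng = np.random.default_rng(0)
x_s = rng.uniform(0, 1, 500)
y_s = 2 * x_s + rng.normal(0, 1.0, 500)       # abundant source data
x_t = rng.uniform(0, 1, 200)
y_t = 2 * x_t + 3 + rng.normal(0, 0.5, 200)   # scarce target data

y_aug = conditional_quantile_match(x_s, y_s, x_t, y_t, k=40)
print("source mean:", y_s.mean(), "calibrated mean:", y_aug.mean(),
      "target mean:", y_t.mean())
```

In this sketch the calibrated pairs `(x_s, y_aug)` would then be pooled with the target sample before fitting the downstream regressor. The neighbourhood size `k` plays the role of a smoothing parameter: larger values reduce the variance of the local quantile estimates but blur the dependence on the covariates.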