LG | Sep 19, 2023

Mixture Weight Estimation and Model Prediction in Multi-source Multi-target Domain Adaptation

Yuyang Deng, Ilja Kuzborskij, Mehrdad Mahdavi

arXiv:2309.10736v2 | 3.82 citations | h-index: 33

Originality: Incremental advance

AI Analysis

This work addresses computational efficiency and theoretical guarantees in multi-source multi-target domain adaptation. The advance is incremental, but it improves upon existing methods for machine learning practitioners dealing with heterogeneous data.

The paper tackles the problem of learning from multiple heterogeneous sources to perform well on new target distributions by estimating optimal mixture weights and efficiently predicting models for numerous target domains. It proposes a stochastic algorithm for mixture weight estimation with provable guarantees and shows that a GD-trained overparameterized neural network can predict target models without solving individual ERM problems, with regret guarantees in an online setting.

We consider the problem of learning a model from multiple heterogeneous sources with the goal of performing well on a new target distribution. The goal of the learner is to mix these data sources in a target-distribution-aware way and simultaneously minimize the empirical risk on the mixed source. The literature has made tangible advances in establishing the theory of learning on mixture domains. However, two problems remain unsolved: first, how to estimate the optimal mixture of sources given a target domain; second, when there are numerous target domains, how to solve empirical risk minimization (ERM) for each target, using a possibly unique mixture of data sources, in a computationally efficient manner. In this paper we address both problems efficiently and with guarantees. We cast the first problem, mixture weight estimation, as a convex-nonconcave compositional minimax problem, and propose an efficient stochastic algorithm with provable stationarity guarantees. For the second problem, we identify that in certain regimes, solving ERM for each target domain individually can be avoided; instead, the parameters of a target-optimal model can be viewed as a non-linear function on the space of mixture coefficients. Building upon this, we show that in the offline setting, a GD-trained overparameterized neural network can provably learn such a function to predict the model of a target domain instead of solving a designated ERM problem. Finally, we also consider an online setting and propose a label-efficient online algorithm, which predicts parameters for new targets given an arbitrary sequence of mixing coefficients, while enjoying regret guarantees.
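
The idea that a target-optimal model is a non-linear function of the mixture coefficients can be illustrated in a simple special case. The sketch below is not taken from the paper: it assumes linear models with squared loss, where the mixed-source ERM solution has the closed form w(α) = (Σᵢ αᵢ Aᵢ)⁻¹ (Σᵢ αᵢ bᵢ), with Aᵢ and bᵢ the per-source second moments. The function names (`source_moments`, `mixture_erm`), the synthetic data, and the small `ridge` term are all hypothetical choices made for illustration.

```python
import numpy as np

def source_moments(X, y):
    """Per-source statistics A = X^T X / n and b = X^T y / n (illustrative helper)."""
    n = X.shape[0]
    return X.T @ X / n, X.T @ y / n

def mixture_erm(alpha, moments, ridge=1e-8):
    """Minimizer of the alpha-weighted squared loss over all sources.

    Assumes linear models and squared loss, so the ERM solution is
    w(alpha) = (sum_i alpha_i A_i)^{-1} (sum_i alpha_i b_i).
    The tiny ridge term only guards against a singular system.
    """
    A = sum(a * m[0] for a, m in zip(alpha, moments))
    b = sum(a * m[1] for a, m in zip(alpha, moments))
    return np.linalg.solve(A + ridge * np.eye(A.shape[0]), b)

# Two synthetic sources with different ground-truth parameters (made-up data).
rng = np.random.default_rng(0)
d = 3
true_w = [np.array([1.0, 0.0, -1.0]), np.array([0.0, 2.0, 1.0])]
sources = []
for w in true_w:
    X = rng.normal(size=(200, d))
    y = X @ w + 0.01 * rng.normal(size=200)
    sources.append((X, y))

# Source statistics are computed once and reused for every mixture.
moments = [source_moments(X, y) for X, y in sources]

# Each new vector of mixing coefficients yields a model without refitting on raw data.
w_first = mixture_erm(np.array([1.0, 0.0]), moments)
w_mixed = mixture_erm(np.array([0.5, 0.5]), moments)
```

In this special case the map from α to w(α) is available in closed form, so no learning is needed to evaluate it. The abstract's claim concerns regimes where such a map exists but must be learned, there by a GD-trained overparameterized neural network.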
