LGMLMay 8, 2020

Automatic Cross-Domain Transfer Learning for Linear Regression

arXiv:2005.04088v51 citations
Originality Incremental advance
AI Analysis

It addresses a domain-specific challenge in machine learning by enabling more robust transfer learning in scenarios with uncertain domain labels, though it is incremental as it builds on existing transfer learning frameworks.

This paper tackles the problem of transfer learning for linear regression when domain information is uncertain or unknown, by inferring latent domains using a Dirichlet process and transferring both explanatory and response variables, resulting in improved performance on real datasets with controlled bias compared to previous methods.

Transfer learning research attempts to make model induction transferable across different domains. This method assumes that specific information regarding to which domain each instance belongs is known. This paper helps to extend the capability of transfer learning for linear regression problems to situations where the domain information is uncertain or unknown; in fact, the framework can be extended to classification problems. For normal datasets, we assume that some latent domain information is available for transfer learning. The instances in each domain can be inferred by different parameters. We obtain this domain information from the distribution of the regression coefficients corresponding to the explanatory variable $x$ as well as the response variable $y$ based on a Dirichlet process, which is more reasonable. As a result, we transfer not only variable $x$ as usual but also variable $y$, which is challenging since the testing data have no response value. Previous work mainly overcomes the problem via pseudo-labelling based on transductive learning, which introduces serious bias. We provide a novel framework for analysing the problem and considering this general situation: the joint distribution of variable $x$ and variable $y$. Furthermore, our method controls the bias well compared with previous work. We perform linear regression on the new feature space that consists of different latent domains and the target domain, which is from the testing data. The experimental results show that the proposed model performs well on real datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes