ML, LG | Jul 1, 2023

Unified Transfer Learning Models in High-Dimensional Linear Regression

arXiv:2307.00238v47.47 citations | h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses the challenge of data scarcity and distribution heterogeneity in transfer learning, offering an interpretable solution with potential applications in domains like social mobility analysis, though it appears incremental as it builds on existing transfer learning frameworks.

The paper tackles the problem of transfer learning in high-dimensional linear regression with heterogeneous source and target data by developing UTrans, a unified model that detects transferable variables and source data, resulting in lower estimation and prediction errors compared to existing methods.

Transfer learning plays a key role in modern data analysis when (1) the target data are scarce but the source data are sufficient, and (2) the distributions of the source and target data are heterogeneous. This paper develops an interpretable unified transfer learning model, termed UTrans, which can detect both transferable variables and source data. More specifically, we establish the estimation error bounds and prove that our bounds are lower than those obtained with target data only. In addition, we propose a source detection algorithm based on hypothesis testing to exclude the nontransferable data. We evaluate UTrans and compare it with existing algorithms in multiple experiments. UTrans attains much lower estimation and prediction errors than the existing methods while preserving interpretability. Finally, we apply UTrans to the US intergenerational mobility data and compare our proposed algorithms with classical machine learning algorithms.
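
The setting described in the abstract (scarce target data, plentiful but slightly different source data) can be illustrated with a small simulation. The sketch below is not the UTrans algorithm; it is a generic pool-then-correct lasso of the kind used in the transfer-learning literature, written with NumPy only. All names, sample sizes, the noise level, the size of the source-target contrast `delta`, and the penalty rule `lam_for` are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Lasso via proximal gradient (ISTA) on (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Hypothetical simulation settings (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
p, n_target, n_source, sigma = 50, 30, 300, 0.5

beta = np.zeros(p)
beta[:5] = 1.0            # sparse target coefficients
delta = np.zeros(p)
delta[:3] = 0.1           # small source-target contrast (heterogeneity)

X0 = rng.standard_normal((n_target, p))
y0 = X0 @ beta + sigma * rng.standard_normal(n_target)
X1 = rng.standard_normal((n_source, p))
y1 = X1 @ (beta + delta) + sigma * rng.standard_normal(n_source)

def lam_for(n):
    """Penalty level of order sigma * sqrt(log p / n) (an assumed rule of thumb)."""
    return sigma * np.sqrt(2 * np.log(p) / n)

# Baseline: lasso fitted on the scarce target data only.
b_target = lasso_ista(X0, y0, lam_for(n_target))

# Step 1: pool target and source data and fit a single lasso.
X_pool = np.vstack([X0, X1])
y_pool = np.concatenate([y0, y1])
w_pooled = lasso_ista(X_pool, y_pool, lam_for(n_target + n_source))

# Step 2: estimate a sparse correction from the target residuals.
d_hat = lasso_ista(X0, y0 - X0 @ w_pooled, lam_for(n_target))
b_transfer = w_pooled + d_hat

err_target = np.linalg.norm(b_target - beta)
err_pooled = np.linalg.norm(w_pooled - beta)
err_transfer = np.linalg.norm(b_transfer - beta)
print(f"target-only error: {err_target:.3f}")
print(f"pooled error:      {err_pooled:.3f}")
print(f"two-step error:    {err_transfer:.3f}")
```

In a toy setup like this one, where the contrast between source and target coefficients is small, pooling tends to reduce the estimation error relative to the target-only fit. If `delta` is made large, the pooled fit can instead become worse than the baseline, which is the situation a source detection step is meant to guard against.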
