Robust Transfer Learning with Unreliable Source Data
This work addresses unreliable source data in transfer learning, offering machine learning practitioners a robust method that prevents negative transfer. The contribution is incremental, since it builds on existing transfer learning frameworks.
The paper tackles robust transfer learning when source data is unreliable. It introduces an "ambiguity level" that measures the discrepancy between the target and source regression functions, and proposes the Transfer Around Boundary (TAB) model to improve classification while avoiding negative transfer. On non-parametric classification and logistic regression tasks, TAB achieves upper bounds that are optimal up to logarithmic factors.
This paper addresses challenges in robust transfer learning stemming from ambiguity in Bayes classifiers and weak transferable signals between the target and source distributions. We introduce a novel quantity called the "ambiguity level" that measures the discrepancy between the target and source regression functions, propose a simple transfer learning procedure, and establish a general theorem that shows how this new quantity is related to the transferability of learning in terms of risk improvements. Our proposed "Transfer Around Boundary" (TAB) model, with a threshold balancing the performance of target and source data, is shown to be both efficient and robust, improving classification while avoiding negative transfer. Moreover, we demonstrate the effectiveness of the TAB model on non-parametric classification and logistic regression tasks, achieving upper bounds that are optimal up to logarithmic factors. Simulation studies lend further support to the effectiveness of TAB. We also provide simple approaches to bound the excess misclassification error without the need for specialized knowledge in transfer learning.
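The thresholding idea behind TAB can be illustrated with a minimal sketch. The abstract does not spell out the decision rule, so the version below is an assumption: it trusts the target estimate when that estimate is far from the decision boundary 1/2, and falls back on the source estimate only near the boundary. The names `tab_predict`, `eta_target`, `eta_source`, and `tau` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def tab_predict(eta_target: float, eta_source: float, tau: float) -> int:
    """Hypothetical threshold rule, assumed for illustration only.

    eta_target: estimated P(Y = 1 | x) fitted on target data.
    eta_source: estimated P(Y = 1 | x) fitted on source data.
    tau: threshold on the distance of eta_target from 1/2.
    """
    # Far from the boundary: the target estimate is treated as decisive.
    if abs(eta_target - 0.5) >= tau:
        return int(eta_target >= 0.5)
    # Near the boundary: borrow the label suggested by the source estimate.
    return int(eta_source >= 0.5)


def tab_predict_batch(
    eta_target: Sequence[float], eta_source: Sequence[float], tau: float
) -> List[int]:
    """Apply the assumed rule pointwise to paired estimates."""
    if len(eta_target) != len(eta_source):
        raise ValueError("eta_target and eta_source must have equal length")
    return [tab_predict(t, s, tau) for t, s in zip(eta_target, eta_source)]


if __name__ == "__main__":
    # Made-up regression-function estimates at three query points.
    print(tab_predict_batch([0.9, 0.55, 0.45], [0.1, 0.1, 0.8], tau=0.2))
```

Under this assumed rule, setting `tau` to 0 ignores the source entirely, while a larger `tau` hands more of the region around the boundary to the source estimate. That is one way to read the abstract's description of a threshold balancing target and source data.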