SOSELETO: A Unified Approach to Transfer Learning and Training with Noisy Labels
This work addresses the challenge of efficiently leveraging source data to improve target performance in machine learning, with applications in domains such as computer vision and natural language processing. It appears incremental, however, since it builds on existing transfer learning and noisy-label techniques.
The paper tackles the problem of selecting informative source examples for target classification tasks. It introduces SOSELETO, a method that uses bilevel optimization to jointly learn source sample weights and classification models, and reports state-of-the-art results in transfer learning and in training with noisy labels.
We present SOSELETO (SOurce SELEction for Target Optimization), a new method for exploiting a source dataset to solve a classification problem on a target dataset. SOSELETO is based on the following simple intuition: some source examples are more informative than others for the target problem. To capture this intuition, each source sample is given a weight; these weights are solved for jointly with the source and target classification problems via a bilevel optimization scheme. The target therefore gets to choose the source samples that are most informative for its own classification task. Furthermore, the bilevel nature of the optimization acts as a kind of regularization on the target, mitigating overfitting. SOSELETO may be applied both to classic transfer learning and to the problem of training on datasets with noisy labels; we show state-of-the-art results on both of these problems.
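To make the weighting idea concrete, the following is a minimal sketch of a bilevel scheme of this kind, not the authors' implementation. It assumes a one-dimensional logistic model with a single shared weight `w`, a one-step gradient approximation for the outer problem, and weights clipped to [0, 1]. The function names, the toy data, and the learning rates are all hypothetical and chosen only for illustration. The inner step descends the weighted source loss; the outer step raises the weight of a source sample when its gradient agrees with the target gradient and lowers it otherwise.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_sample_grads(w, xs, ys):
    """Gradient of the logistic loss w.r.t. the scalar weight w, one value per sample."""
    return [(sigmoid(w * x) - y) * x for x, y in zip(xs, ys)]

def soseleto_sketch(source_x, source_y, target_x, target_y,
                    steps=200, lr_w=0.5, lr_alpha=5.0):
    """Toy bilevel loop (illustrative assumption, not the paper's algorithm).

    Inner problem: minimize the alpha-weighted source loss over w.
    Outer problem: minimize the target loss over alpha, using a
    first-order approximation of one inner gradient step.
    """
    n = len(source_x)
    w = 0.0
    alpha = [1.0] * n  # every source sample starts fully trusted
    for _ in range(steps):
        g_src = per_sample_grads(w, source_x, source_y)
        g_tgt = sum(per_sample_grads(w, target_x, target_y)) / len(target_x)
        # Inner step: gradient descent on the weighted source loss.
        new_w = w - lr_w * sum(a * g for a, g in zip(alpha, g_src)) / n
        # Outer step: d(target loss)/d(alpha_j) is approximately
        # -(lr_w / n) * g_src[j] * g_tgt, so descending it rewards
        # source gradients that point the same way as the target gradient.
        alpha = [min(1.0, max(0.0, a + lr_alpha * lr_w / n * g * g_tgt))
                 for a, g in zip(alpha, g_src)]
        w = new_w
    return w, alpha

# Toy data: the true label is 1 when x > 0. The last two source labels are flipped.
SOURCE_X = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0, -1.2, 1.2]
SOURCE_Y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
TARGET_X = [-1.0, -0.3, 0.4, 1.1]
TARGET_Y = [0, 0, 1, 1]

if __name__ == "__main__":
    w, alpha = soseleto_sketch(SOURCE_X, SOURCE_Y, TARGET_X, TARGET_Y)
    print("learned w:", w)
    print("source weights:", [round(a, 3) for a in alpha])
```

On this toy data the eight correctly labeled source samples keep a weight of 1.0, while the two samples with flipped labels end with weights below 1.0, because their gradients oppose the target gradient at every step. A real setting would use a deep network with separate source and target classifier heads; the scalar model here only illustrates how the target can down-weight uninformative source samples.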