LG | CV | ML | Sep 28, 2020

Scalable Transfer Learning with Expert Models

arXiv:2009.13239v1 | 70 citations
AI Analysis

This work addresses the need for scalable and efficient transfer learning in vision tasks, offering a practical solution with significant computational savings.

The paper addresses the problem that generic pre-trained representations are not tailored to specific downstream tasks. It instead transfers from expert representations, choosing the relevant expert for each target task with cheap performance proxies. This yields a speed-up of 2-3 orders of magnitude over competing approaches and outperforms baselines on over 20 diverse vision tasks.

Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploiting existing label structures, and use cheap-to-compute performance proxies to select the relevant expert for each target task. This strategy scales the process of transferring to new tasks, since it does not revisit the pre-training data during transfer. Accordingly, it requires little extra compute per target task, and results in a speed-up of 2-3 orders of magnitude compared to competing approaches. Further, we provide an adapter-based architecture able to compress many experts into a single model. We evaluate our approach on two different data sources and demonstrate that it outperforms baselines on over 20 diverse vision tasks in both cases.
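The selection step described in the abstract can be illustrated with a minimal sketch. The abstract only says that the proxies are cheap to compute; the sketch below assumes one plausible proxy, leave-one-out k-nearest-neighbour accuracy on frozen expert features, which needs no fine-tuning and no access to the pre-training data. The function names (loo_knn_accuracy, select_expert), the expert names, and the synthetic data are all hypothetical and not taken from the paper.

```python
import numpy as np


def loo_knn_accuracy(features, labels, k=1):
    """Leave-one-out k-NN accuracy on fixed (frozen) features.

    Assumed stand-in for a "cheap performance proxy"; the paper's
    exact proxy is not stated in the abstract.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all target examples.
    sq_norms = (features ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    # Exclude each point from its own neighbourhood.
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    correct = 0
    for i, idx in enumerate(neighbours):
        # Majority vote among the k nearest neighbours.
        values, counts = np.unique(labels[idx], return_counts=True)
        correct += int(values[np.argmax(counts)] == labels[i])
    return correct / len(labels)


def select_expert(expert_features, labels, k=1):
    """Score every expert with the proxy and return the best one."""
    scores = {
        name: loo_knn_accuracy(feats, labels, k)
        for name, feats in expert_features.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic illustration: 40 target examples, 2 classes, 2 hypothetical experts.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
expert_features = {
    # Features that separate the classes well (a "relevant" expert).
    "expert_good": labels[:, None] * 10.0 + rng.normal(size=(40, 4)),
    # Features unrelated to the labels (an "irrelevant" expert).
    "expert_bad": rng.normal(size=(40, 4)),
}
best, scores = select_expert(expert_features, labels, k=1)
print(best, scores)
```

Each expert is scored from a single forward pass over the target data, so the per-task cost stays small compared with fine-tuning every expert. In a real setting the feature arrays would come from the frozen expert networks rather than from a random generator.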

