LG AI | Apr 25, 2023

Towards Compute-Optimal Transfer Learning

Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu

DeepMind

arXiv:2304.13164v1 | 6.64 citations | h-index: 75

Originality: Incremental advance

AI Analysis

This work addresses the computational cost of transfer learning with large pretrained models. It offers an incremental improvement: pruning the pretrained model to make finetuning more compute-efficient.

The paper tackles the high computational and memory cost of finetuning large pretrained models in transfer learning. It proposes zero-shot structured pruning, which gives up some asymptotic performance in exchange for compute efficiency. On the Nevis'22 benchmark, pruning convolutional filters of pretrained models yields more than 20% performance improvement in low-compute regimes.

The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark that offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
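To make "structured pruning of convolutional filters" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how filters are selected, so the L1-norm ranking used here is an assumption. The names `l1_norm`, `prune_filters`, and `keep_ratio` are hypothetical. "Zero-shot" is read here as pruning the pretrained weights directly, before any finetuning on the downstream task.

```python
def l1_norm(x):
    """Sum of absolute values over an arbitrarily nested list of numbers."""
    if isinstance(x, (int, float)):
        return abs(x)
    return sum(l1_norm(v) for v in x)


def prune_filters(weight, next_weight, keep_ratio):
    """Remove whole output filters from one conv layer (structured pruning).

    weight:      nested lists shaped [out_ch][in_ch][kh][kw]
    next_weight: following conv layer, shaped [next_out][out_ch][kh][kw]
    keep_ratio:  fraction of filters to keep, in (0, 1]
    """
    n_out = len(weight)
    n_keep = max(1, int(n_out * keep_ratio))
    # Assumed criterion: rank filters by L1 norm, largest first.
    ranked = sorted(range(n_out), key=lambda i: l1_norm(weight[i]), reverse=True)
    keep = sorted(ranked[:n_keep])  # preserve the original channel order
    pruned = [weight[i] for i in keep]
    # The next layer loses the input channels that fed from dropped filters.
    pruned_next = [[f[i] for i in keep] for f in next_weight]
    return pruned, pruned_next, keep


# Toy example: 4 filters with 1x1 kernels, followed by a layer with 2 filters.
weight = [[[[0.1]]], [[[-2.0]]], [[[0.5]]], [[[3.0]]]]
next_weight = [
    [[[1.0]], [[2.0]], [[3.0]], [[4.0]]],
    [[[5.0]], [[6.0]], [[7.0]], [[8.0]]],
]
pruned, pruned_next, keep = prune_filters(weight, next_weight, keep_ratio=0.5)
```

Because entire filters are removed, both layers become smaller dense tensors, so the compute savings need no sparse kernels. The pruned model would then be finetuned as usual.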
