Beyond Fine Tuning: A Modular Approach to Learning on Small Data
This work addresses the challenge of limited data availability for machine learning practitioners and offers a more effective alternative to fine-tuning, though it appears incremental because it builds on existing transfer learning concepts.
The paper tackles the problem of training neural networks on small datasets by introducing a modular approach that combines pre-trained and untrained modules to learn the shift in distributions between datasets. The approach surpasses standard fine-tuning and significantly improves on it when less data is available.
In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural network or the use of domain-specific hand-engineered features. Here we take the approach of treating network layers, or entire networks, as modules, and we combine pre-trained modules with untrained modules to learn the shift in distributions between data sets. The central impact of using a modular approach comes from adding new representations to a network, as opposed to replacing representations via fine-tuning. Using this technique, we are able to surpass results obtained with standard fine-tuning transfer learning approaches, and we are also able to significantly increase performance over such approaches when using smaller amounts of data.
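To make the contrast between adding and replacing representations concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that a pre-trained module is kept frozen, that an untrained module runs in parallel on the same input, and that their outputs are concatenated before a new output layer; the abstract does not specify how modules are joined, so the concatenation, the layer sizes, and the names `ModularNet`, `init_layer`, `linear`, and `relu` are all hypothetical.

```python
import random


def linear(x, W, b):
    """Affine map: W is a list of weight rows, x and b are lists of floats."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def init_layer(n_out, n_in, rng):
    """Random weights and zero biases for an untrained layer."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


class ModularNet:
    """Frozen pre-trained module plus an untrained module (assumed layout)."""

    def __init__(self, pretrained, n_in, n_new, n_classes, rng):
        self.pretrained = pretrained             # (W, b); never updated
        self.new = init_layer(n_new, n_in, rng)  # untrained, to be learned
        n_features = len(pretrained[0]) + n_new
        self.head = init_layer(n_classes, n_features, rng)

    def features(self, x):
        old = relu(linear(x, *self.pretrained))  # existing representation, kept
        new = relu(linear(x, *self.new))         # added representation
        return old + new                         # list concatenation

    def forward(self, x):
        return linear(self.features(x), *self.head)

    def trainable(self):
        # Only these parameters would receive gradient updates.
        return {"new": self.new, "head": self.head}


rng = random.Random(0)
pretrained = init_layer(4, 3, rng)  # stand-in for weights learned on a large data set
net = ModularNet(pretrained, n_in=3, n_new=2, n_classes=2, rng=rng)
print(len(net.features([0.5, -1.0, 2.0])))  # 4 pre-trained + 2 new features
```

In this sketch, fine-tuning would correspond to updating `pretrained` itself, which overwrites the original representation. The modular variant leaves it untouched and trains only the parameters returned by `trainable()`, so the small data set is used to learn what the pre-trained features are missing.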