Does Optimal Source Task Performance Imply Optimal Pre-training for a Target Task?
This work addresses a practical question in transfer learning: whether a network should be pre-trained to optimal source task performance before it is fine-tuned on a target task.
The paper challenges the assumption that optimal source task performance yields the best pre-trained model for fine-tuning. It shows that stopping source training early can improve target task performance, and can even improve relearning of the source task itself. Experiments demonstrate the effect for different amounts of training and different learning rates.
Fine-tuning of pre-trained deep nets is commonly used to improve the accuracy and reduce the training time of neural nets. It is generally assumed that pre-training a net for optimal source task performance best prepares it for fine-tuning to learn an arbitrary target task. This is generally not true. Stopping source task training before optimal performance is reached can create a pre-trained net better suited for fine-tuning to learn a new task. We perform several experiments that demonstrate this effect and examine how the amount of training and the learning rate influence it. Additionally, our results indicate that the effect reflects a general loss of learning ability that even extends to relearning the source task.
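The comparison described in the abstract can be pictured as a sweep over pre-training checkpoints: train on the source task, pause at chosen epochs, fine-tune a copy of the net on the target task, and record the target score. The sketch below is only an illustration of that idea, not the authors' experimental code. The names `sweep_pretraining_checkpoints`, `train_one_epoch`, `finetune_and_score`, and `best_checkpoint` are hypothetical, and the two callables stand in for whatever framework-specific training and evaluation routines are actually used.

```python
import copy
from typing import Callable, Dict, Iterable, TypeVar

Model = TypeVar("Model")


def sweep_pretraining_checkpoints(
    model: Model,
    train_one_epoch: Callable[[Model], None],
    finetune_and_score: Callable[[Model], float],
    checkpoint_epochs: Iterable[int],
) -> Dict[int, float]:
    """Map each source-training epoch count to the target score after fine-tuning.

    Assumptions (hypothetical interface):
      - train_one_epoch(model) trains `model` in place for one source-task epoch.
      - finetune_and_score(model) fine-tunes `model` on the target task and
        returns a score where higher is better (for example, accuracy).
    """
    scores: Dict[int, float] = {}
    epochs_done = 0
    for stop_epoch in sorted(set(checkpoint_epochs)):
        # Continue source-task training until the next checkpoint is reached.
        while epochs_done < stop_epoch:
            train_one_epoch(model)
            epochs_done += 1
        # Fine-tune a deep copy so source-task training can resume undisturbed.
        snapshot = copy.deepcopy(model)
        scores[stop_epoch] = finetune_and_score(snapshot)
    return scores


def best_checkpoint(scores: Dict[int, float]) -> int:
    """Return the source-training epoch count with the highest target score."""
    return max(scores, key=scores.get)
```

If the abstract's claim holds for a given pair of tasks, `best_checkpoint` would be expected to return an epoch count earlier than the one at which source task performance peaks. Passing a source-task relearning routine as `finetune_and_score` would give the analogous check for relearning the source task.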