Deep Transfer Learning with Ridge Regression
The work addresses the computational cost of finetuning for researchers and practitioners who apply deep learning to transfer learning. The contribution is incremental, since it builds on existing methods.
The paper tackles the computational bottleneck of finetuning in deep transfer learning by applying kernel ridge regression to low-rank features. This eliminates finetuning, reduces the number of hyperparameters, and is reported to work on both supervised and semi-supervised tasks.
The large amount of online data and the vast array of computing resources enable researchers in both industry and academia to employ the power of deep learning with neural networks. While deep models trained on massive amounts of data demonstrate promising generalisation ability on unseen data from relevant domains, the computational cost of finetuning gradually becomes a bottleneck in transferring what has been learnt to new domains. We address this issue by combining the low-rank property of learnt feature vectors produced by deep neural networks (DNNs) with the closed-form solution provided by kernel ridge regression (KRR). This frees transfer learning from finetuning and replaces it with an ensemble of linear systems with far fewer hyperparameters. Our method is successful on supervised and semi-supervised transfer learning tasks.
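To make the closed-form idea concrete, the following is a minimal sketch of ridge regression on frozen features with a truncated SVD. It is not the paper's implementation. It assumes a linear kernel, a single regulariser `lam`, and a hand-picked `rank`, and the function names `fit_ridge_low_rank` and `predict` are hypothetical. It illustrates only that, when the feature matrix is low-rank, the ridge solution can be computed from the top singular directions without any gradient-based finetuning.

```python
import numpy as np


def fit_ridge_low_rank(features, targets, lam=1.0, rank=None):
    """Closed-form ridge regression via the SVD of the feature matrix.

    features: (n, d) array of frozen DNN features (assumed precomputed).
    targets:  (n, k) array of regression targets or one-hot labels.
    lam:      ridge regulariser (illustrative default).
    rank:     number of singular directions to keep; None keeps all.
    """
    # Thin SVD: features = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(features, full_matrices=False)
    if rank is not None:
        U, s, Vt = U[:, :rank], s[:rank], Vt[:rank]
    # Ridge solution: W = V diag(s / (s^2 + lam)) U^T Y
    shrink = s / (s ** 2 + lam)
    return Vt.T @ (shrink[:, None] * (U.T @ targets))


def predict(features, coef):
    """Linear prediction with the fitted coefficients."""
    return features @ coef


# Synthetic stand-in for DNN features: 50 samples, 20 dims, exact rank 5.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 20))
Y = rng.standard_normal((50, 3))

W = fit_ridge_low_rank(X, Y, lam=0.1, rank=5)
Y_hat = predict(X, W)
```

Because the synthetic features have exact rank 5, keeping five singular directions reproduces the full ridge solution. With real features the rank would be a modelling choice, and an ensemble over several values of `lam` or `rank` would require one such solve per setting.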