Fast Adaptation with Linearized Neural Networks
This work addresses efficient, interpretable domain adaptation for machine learning practitioners, offering an approach that avoids the local optima issues that arise when fine-tuning neural network weights.
The paper tackles the challenge of understanding and adapting neural network inductive biases. It proposes a method that uses linearized neural networks to embed these biases into Gaussian processes, which enables interpretable domain adaptation with analytic inference and uncertainty estimation. Experiments on image classification and regression show the framework's promise for transfer learning compared to neural network fine-tuning.
The inductive biases of trained neural networks are difficult to understand and, consequently, to adapt to new settings. We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions. Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network. In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation. This inference is analytic and free of the local optima issues found in standard techniques such as fine-tuning neural network weights to a new task. We develop significant computational speed-ups based on matrix multiplies, including a novel implementation for scalable Fisher vector products. Our experiments on both image classification and regression demonstrate the promise and convenience of this framework for transfer learning, compared to neural network fine-tuning. Code is available at https://github.com/amzn/xfer/tree/master/finite_ntk.
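
As an illustration of the idea described in the abstract, the following is a minimal sketch of a Gaussian process whose kernel is built from the Jacobian of a network, k(x, x') = J(x) J(x')^T, followed by analytic posterior inference on a new task. It is not the authors' implementation and does not use the linked repository. The tiny one-hidden-layer network, the function names `mlp`, `jacobian`, and `gp_adapt`, the unit Gaussian prior on the parameter perturbation, the use of the network's own output as the prior mean, and the noise variance are all assumptions made for the example.

```python
import numpy as np

def mlp(params, x):
    """Hypothetical one-hidden-layer network: scalar input, scalar output."""
    w1, b1, w2, b2 = params
    return np.tanh(np.outer(x, w1) + b1) @ w2 + b2

def jacobian(params, x):
    """Analytic Jacobian of the outputs w.r.t. all parameters, shape (n, p)."""
    w1, b1, w2, b2 = params
    h = np.tanh(np.outer(x, w1) + b1)            # hidden activations, (n, H)
    dh = (1.0 - h ** 2) * w2                     # df / d(pre-activation), (n, H)
    return np.hstack([dh * x[:, None],           # df / dw1
                      dh,                        # df / db1
                      h,                         # df / dw2
                      np.ones((len(x), 1))])     # df / db2

def gp_adapt(params, x_train, y_train, x_test, noise=1e-2):
    """GP posterior mean and variance under the Jacobian kernel.

    Assumes a N(0, I) prior on the parameter perturbation and a prior mean
    equal to the network output at the given parameters.
    """
    J_tr = jacobian(params, x_train)
    J_te = jacobian(params, x_test)
    K = J_tr @ J_tr.T + noise * np.eye(len(x_train))   # train kernel + noise
    K_ts = J_te @ J_tr.T                                # test/train kernel
    resid = y_train - mlp(params, x_train)              # what the network misses
    mean = mlp(params, x_test) + K_ts @ np.linalg.solve(K, resid)
    cov = J_te @ J_te.T - K_ts @ np.linalg.solve(K, K_ts.T)
    # Clip tiny negative values caused by floating-point round-off.
    return mean, np.clip(np.diag(cov), 0.0, None)

# Toy usage: random weights stand in for a trained source network.
rng = np.random.default_rng(0)
H = 8
params = (rng.normal(size=H), rng.normal(size=H), rng.normal(size=H), 0.0)
x_train = np.linspace(-2.0, 2.0, 10)
y_train = np.sin(x_train)                  # targets for the "new task"
x_test = np.linspace(-3.0, 3.0, 5)
mean, var = gp_adapt(params, x_train, y_train, x_test)
```

Adaptation here is a single linear solve, so there is no iterative optimization and no dependence on initialization. The sketch forms the n-by-n kernel matrix explicitly, which is only practical for small problems; it makes no attempt at the matrix-multiply speed-ups or Fisher vector products mentioned above.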