LG · CV · Mar 1, 2018

Knowledge Transfer with Jacobian Matching

arXiv:1803.00443v1 · 180 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient knowledge transfer between neural networks. It is an incremental advance, since it builds on existing Jacobian matching methods.

The paper addresses knowledge transfer in neural networks. It derives loss functions for Jacobian matching from an equivalence with distillation under input noise, and shows experimentally on standard image datasets that this approach improves distillation, robustness to noisy inputs, and transfer learning.

Classical distillation methods transfer representations from a "teacher" neural network to a "student" network by matching their output activations. Recent methods also match the Jacobians, that is, the gradients of the output activations with respect to the input. However, this involves making some ad hoc decisions, in particular the choice of the loss function. In this paper, we first establish an equivalence between Jacobian matching and distillation with input noise, from which we derive appropriate loss functions for Jacobian matching. We then rely on this analysis to apply Jacobian matching to transfer learning by establishing equivalence of a recent transfer learning procedure to distillation. We then show experimentally on standard image datasets that Jacobian-based penalties improve distillation, robustness to noisy inputs, and transfer learning.
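
The equivalence described in the abstract can be illustrated numerically. The sketch below is not the authors' code. It assumes a squared-error distillation loss, scalar-output models, and isotropic Gaussian input noise with standard deviation sigma. Under those assumptions, a first-order Taylor expansion suggests that the expected noisy distillation loss is approximately the output-matching term plus sigma squared times the squared norm of the gradient difference. All names here (grad_fd, jacobian_matching_loss, noisy_distillation_loss, affine) are hypothetical helpers introduced for illustration, and the toy teacher and student are affine maps so that the approximation is exact.

```python
import random


def grad_fd(f, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad


def jacobian_matching_loss(teacher, student, x, sigma):
    """Output-matching term plus sigma^2 times the squared gradient mismatch."""
    output_term = (teacher(x) - student(x)) ** 2
    g_t, g_s = grad_fd(teacher, x), grad_fd(student, x)
    jacobian_term = sum((a - b) ** 2 for a, b in zip(g_t, g_s))
    return output_term + sigma ** 2 * jacobian_term


def noisy_distillation_loss(teacher, student, x, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of the squared-error loss under Gaussian input noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x_noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        total += (teacher(x_noisy) - student(x_noisy)) ** 2
    return total / n_samples


def affine(w, b):
    """Toy scalar-output model: x -> w . x + b (an assumption for this sketch)."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


teacher = affine([2.0, -1.0], 0.5)
student = affine([1.0, 1.0], 0.0)
x = [1.0, 0.5]
sigma = 0.1

# Closed form for these affine maps: 0.5**2 + 0.1**2 * (1**2 + 2**2) = 0.30
jm = jacobian_matching_loss(teacher, student, x, sigma)
mc = noisy_distillation_loss(teacher, student, x, sigma)
print(f"Jacobian matching loss:          {jm:.4f}")
print(f"Noisy distillation (Monte Carlo): {mc:.4f}")
```

For nonlinear networks the two quantities should agree only up to higher-order terms in sigma, and in practice the gradients would come from automatic differentiation rather than finite differences.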

