LG · Jun 11, 2023

An Information-Theoretic Approach to Semi-supervised Transfer Learning

arXiv:2306.06731v1 · 1 citation · h-index: 51
Originality: Incremental advance
AI Analysis

This work addresses performance issues in transfer learning when labeled target data is limited. Its theoretical approach is novel, while its practical contribution is an incremental advance.

The paper tackles the problem of distribution discrepancies in semi-supervised transfer learning by proposing information-theoretic regularization terms based on Mutual Information and Lautum Information to improve neural network transferability, demonstrating effectiveness in experiments.

Transfer learning is a valuable tool in deep learning as it allows propagating information from one "source dataset" to another "target dataset", especially in the case of a small number of training examples in the latter. Yet, discrepancies between the underlying distributions of the source and target data are commonplace and are known to have a substantial impact on algorithm performance. In this work we suggest novel information-theoretic approaches for the analysis of the performance of deep neural networks in the context of transfer learning. We focus on the task of semi-supervised transfer learning, in which unlabeled samples from the target dataset are available during network training on the source dataset. Our theory suggests that one may improve the transferability of a deep neural network by incorporating regularization terms on the target data based on information-theoretic quantities, namely the Mutual Information and the Lautum Information. We demonstrate the effectiveness of the proposed approaches in various semi-supervised transfer learning experiments.
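The two quantities named in the abstract have standard definitions for discrete variables: Mutual Information is the Kullback-Leibler divergence from the joint distribution to the product of its marginals, and Lautum Information ("mutual" spelled backwards) swaps the two arguments of that divergence. The sketch below only illustrates these definitions on a small joint probability table. It is not the paper's regularizer, which operates on network outputs, and the function names and example tables are assumptions made for illustration.

```python
import math


def marginals(joint):
    """Return the row marginal P_X and column marginal P_Y of a joint table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py


def mutual_information(joint):
    """I(X;Y) = KL(P_XY || P_X P_Y), in nats."""
    px, py = marginals(joint)
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                total += p * math.log(p / (px[i] * py[j]))
    return total


def lautum_information(joint):
    """L(X;Y) = KL(P_X P_Y || P_XY), in nats.

    Infinite when the joint is zero somewhere the product of marginals is not.
    """
    px, py = marginals(joint)
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            q = px[i] * py[j]
            if q > 0:
                if p == 0:
                    return math.inf
                total += q * math.log(q / p)
    return total


# Hypothetical example tables (rows index X, columns index Y).
dependent = [[0.4, 0.1], [0.1, 0.4]]
independent = [[0.25, 0.25], [0.25, 0.25]]

if __name__ == "__main__":
    for name, table in [("dependent", dependent), ("independent", independent)]:
        print(name, mutual_information(table), lautum_information(table))
```

Both quantities are zero exactly when X and Y are independent and positive otherwise, but they generally differ in value because the Kullback-Leibler divergence is not symmetric.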

