Learning Deep Representations with Probabilistic Knowledge Transfer
This paper addresses the problem of extending knowledge transfer beyond classification to broader representation learning tasks, which is relevant to researchers and practitioners in machine learning, though the contribution appears incremental.
The paper tackles the restriction of existing knowledge transfer methods to classification tasks. It proposes a technique that trains a student model to maintain the same mutual information between its learned representation and the labels as the teacher model, enabling applications such as cross-modal transfer from text to visual data.
Knowledge Transfer (KT) techniques tackle the problem of transferring the knowledge from a large and complex neural network into a smaller and faster one. However, existing KT methods are tailored towards classification tasks and cannot be used efficiently for other representation learning tasks. This paper proposes a novel knowledge transfer technique that trains a student model to maintain the same amount of mutual information between the learned representation and a set of (possibly unknown) labels as the teacher model. Apart from outperforming existing KT techniques, the proposed method overcomes several of their limitations, providing new insight into KT as well as novel KT applications, ranging from knowledge transfer from handcrafted feature extractors to cross-modal KT from the textual modality into the representation extracted from the visual modality of the data.
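The abstract does not spell out how the student is made to preserve the teacher's mutual information, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each batch of representations is turned into pairwise similarity distributions with a cosine-similarity kernel, and that the student is trained to minimise the KL divergence between the teacher's and its own distributions. The kernel choice, the divergence, and the names `pairwise_probabilities` and `transfer_loss` are all assumptions made for illustration.

```python
import math


def pairwise_probabilities(features, eps=1e-7):
    """Turn a batch of feature vectors into row-wise similarity distributions.

    Assumption: a cosine-similarity kernel, shifted to [0, 1], models how
    likely two samples are to be neighbours in the feature space.
    """
    # Scale each vector to unit length so dot products are cosine similarities.
    unit = []
    for vec in features:
        norm = math.sqrt(sum(x * x for x in vec)) + eps
        unit.append([x / norm for x in vec])

    probs = []
    for i, a in enumerate(unit):
        # Map cosine similarity from [-1, 1] to [0, 1]; ignore the self-pair.
        row = [
            0.0 if i == j else (sum(x * y for x, y in zip(a, b)) + 1.0) / 2.0
            for j, b in enumerate(unit)
        ]
        total = sum(row) + eps
        # Normalise so that each row sums to (approximately) one.
        probs.append([s / total for s in row])
    return probs


def transfer_loss(teacher_features, student_features, eps=1e-7):
    """Mean row-wise KL divergence from teacher to student distributions.

    Only the sample-to-sample geometry is compared, so the two feature
    spaces may have different dimensionality and no labels are needed.
    """
    p = pairwise_probabilities(teacher_features, eps)
    q = pairwise_probabilities(student_features, eps)
    kl_rows = [
        sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p_row, q_row))
        for p_row, q_row in zip(p, q)
    ]
    return sum(kl_rows) / len(kl_rows)


# Hypothetical toy batch: 3-d teacher features, 2-d student features.
teacher = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
student = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(transfer_loss(teacher, student))
```

Because the loss depends only on similarities between samples rather than on class scores, the same sketch would apply unchanged when the "teacher" is a handcrafted feature extractor or a model operating on another modality, which is the kind of application the abstract mentions.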