TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces
This work addresses data efficiency for practitioners who fine-tune foundation models, though it appears incremental because it builds on existing fine-tuning and semi-supervised methods.
The paper tackles the problem of fine-tuning foundation models with limited labelled data. It proposes a semi-supervised framework that uses mutual information decomposition to optimise both the downstream task space and the latent space, and it reports significant improvements in classification tasks under low-labelled conditions.
We present a semi-supervised fine-tuning framework for foundation models that utilises mutual information decomposition to address the challenges of training with a limited amount of labelled data. Our approach derives two distinct lower bounds: i) for the downstream task space, such as classification, optimised using conditional and marginal cross-entropy alongside Kullback-Leibler divergence, and ii) for the latent space representation, regularised and aligned using a contrastive-like decomposition. This fine-tuning strategy retains the pre-trained structure of the foundation model, modifying only a specialised projector module comprising a small transformer and a token aggregation technique. Experiments on several datasets demonstrate significant improvements in classification tasks under extremely low-labelled conditions by effectively leveraging unlabelled data.
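To make the task-space terms concrete, the sketch below shows one standard way that a conditional cross-entropy, a marginal cross-entropy, and a Kullback-Leibler divergence can combine into a lower-bound estimate of the mutual information I(X;Y). It is an illustration under stated assumptions, not the paper's exact objective. It uses the identity I(X;Y) = H(Y) - H(Y|X), bounds H(Y|X) from above by the classifier's conditional cross-entropy, and rewrites H(Y) as a cross-entropy minus a KL divergence. The function names, the known class prior, and the toy probabilities are all hypothetical.

```python
import math

def conditional_cross_entropy(probs, labels):
    """Mean of -log q(y_i | x_i) over labelled examples."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def batch_marginal(probs):
    """Average predicted class distribution over a batch (e.g. unlabelled data)."""
    n, k = len(probs), len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(k)]

def cross_entropy(p, q):
    """H(p, q) = -sum_c p_c log q_c."""
    return -sum(pc * math.log(qc) for pc, qc in zip(p, q) if pc > 0)

def kl_divergence(p, q):
    """KL(p || q) = sum_c p_c log(p_c / q_c)."""
    return sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0)

def mi_lower_bound_estimate(labelled_probs, labels, unlabelled_probs, prior):
    """Estimate H(Y) - CE_cond, which lower-bounds I(X;Y) in expectation.

    Assumption: `prior` is the true class prior p(y), so that
    H(Y) = H(prior, q_marg) - KL(prior || q_marg) holds exactly.
    """
    cond_ce = conditional_cross_entropy(labelled_probs, labels)
    q_marg = batch_marginal(unlabelled_probs)
    marg_ce = cross_entropy(prior, q_marg)
    kl = kl_divergence(prior, q_marg)
    return marg_ce - kl - cond_ce, cond_ce, marg_ce, kl

# Toy example: two classes, two labelled and two unlabelled predictions.
bound, cond_ce, marg_ce, kl = mi_lower_bound_estimate(
    labelled_probs=[[0.9, 0.1], [0.2, 0.8]],
    labels=[0, 1],
    unlabelled_probs=[[0.75, 0.25], [0.25, 0.75]],
    prior=[0.5, 0.5],
)
```

In a semi-supervised setting like the one described, the conditional term would be computed on the labelled examples and the marginal terms on the unlabelled ones. How the paper estimates or weights each term is not stated in the abstract, so the equal weighting here is only a placeholder.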