CL · Sep 28, 2022

Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-trained Language Models

arXiv:2209.14161v1 · 7 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses optimization challenges in fine-tuning large language models for classification tasks, representing an incremental improvement over existing SCL methods.

The paper tackles the conflict between objectives in Supervised Contrastive Learning (SCL) by formulating it as a Multi-Objective Optimization problem during fine-tuning of RoBERTa, achieving significant performance improvements on GLUE benchmark tasks without data augmentations or adversarial examples.

Recently, Supervised Contrastive Learning (SCL) has been shown to achieve excellent performance in most classification tasks. In SCL, a neural network is trained to optimize two objectives: pull an anchor and positive samples together in the embedding space, and push the anchor apart from the negatives. However, these two objectives may conflict, requiring trade-offs between them during optimization. In this work, we formulate the SCL problem as a Multi-Objective Optimization problem for the fine-tuning phase of the RoBERTa language model. Two methods are used to solve the optimization problem: (i) the linear scalarization (LS) method, which minimizes a weighted linear combination of per-task losses; and (ii) the Exact Pareto Optimal (EPO) method, which finds the intersection of the Pareto front with a given preference vector. We evaluate our approach on several GLUE benchmark tasks, without using data augmentations, memory banks, or generating adversarial examples. The empirical results show that the proposed learning strategy significantly outperforms a strong competitive contrastive learning baseline.
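
The two objectives and the linear scalarization step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the split into a "pull" term (negative mean anchor-positive similarity) and a "push" term (log-sum-exp of anchor-negative similarities) is an assumed decomposition chosen for illustration, and the names `pull_push_losses`, `linear_scalarization`, and `temperature` are hypothetical. EPO is not sketched here, since it requires a dedicated solver for the preference-vector constraint.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def pull_push_losses(embeddings, labels, temperature=0.1):
    """Return (pull, push), averaged over anchors that have both a positive
    and a negative in the batch.

    Assumed decomposition (illustrative, not taken from the paper):
      pull = -mean similarity between the anchor and same-label samples
      push = log-sum-exp of similarities between the anchor and other-label samples
    """
    z = [normalize(e) for e in embeddings]
    pulls, pushes = [], []
    for i, zi in enumerate(z):
        pos = [dot(zi, z[j]) / temperature
               for j in range(len(z)) if j != i and labels[j] == labels[i]]
        neg = [dot(zi, z[j]) / temperature
               for j in range(len(z)) if labels[j] != labels[i]]
        if not pos or not neg:
            continue  # anchor contributes to neither objective
        pulls.append(-sum(pos) / len(pos))
        m = max(neg)  # subtract the max for a numerically stable log-sum-exp
        pushes.append(m + math.log(sum(math.exp(s - m) for s in neg)))
    if not pulls:
        raise ValueError("batch needs an anchor with both a positive and a negative")
    return sum(pulls) / len(pulls), sum(pushes) / len(pushes)


def linear_scalarization(losses, weights):
    """Combine per-objective losses into one scalar with fixed weights."""
    return sum(w * l for w, l in zip(weights, losses))


# Toy batch: two classes, two samples each, embeddings already orthogonal.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
pull, push = pull_push_losses(embeddings, labels, temperature=1.0)
total = linear_scalarization((pull, push), weights=(0.5, 0.5))
```

In this sketch the weights play the role of the preference between the two objectives: raising the first weight favors tight same-class clusters, while raising the second favors separation from other classes. In an actual fine-tuning run the scalarized value would be added to the classification loss and backpropagated through the encoder.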

