CL · Mar 7, 2025

Similarity-Based Domain Adaptation with LLMs

arXiv:2503.05281v1 · 1 citation · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses domain adaptation for text classification and offers a more efficient approach by eliminating source model training, though the advance appears incremental because it builds on existing LLM capabilities.

The paper tackles the problem of unsupervised domain adaptation by proposing a framework that uses Large Language Models (LLMs) for target data annotation without source model training, achieving a 2.44% accuracy improvement over the state-of-the-art method.

Unsupervised domain adaptation leverages abundant labeled data from various source domains to generalize onto unlabeled target data. Prior research has primarily focused on learning domain-invariant features across the source and target domains. However, these methods often require training a model using source domain data, which is time-consuming and can limit model usage for applications with different source data. This paper introduces a simple framework that utilizes the impressive generalization capabilities of Large Language Models (LLMs) for target data annotation without the need for source model training, followed by a novel similarity-based knowledge distillation loss. Our extensive experiments on cross-domain text classification reveal that our framework achieves impressive performance, specifically a 2.44% accuracy improvement over the SOTA method.
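
The abstract does not spell out the form of the similarity-based knowledge distillation loss, so the following is only a rough sketch of one common way such a loss can be built, not the paper's actual formulation. It assumes a smaller student classifier is trained on LLM-assigned labels for the target data, with an extra term that pushes the student's pairwise cosine similarities toward those of a teacher representation. All function names, the feature inputs, and the weight `alpha` are hypothetical.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the pairwise cosine-similarity matrix of a list of vectors."""
    # A zero vector gets norm 1.0 to avoid division by zero.
    norms = [math.sqrt(sum(x * x for x in v)) or 1.0 for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]


def similarity_distillation_loss(teacher_feats, student_feats):
    """Mean squared gap between teacher and student pairwise similarities.

    Assumption: this similarity-matching form is illustrative only.
    """
    t = cosine_similarity_matrix(teacher_feats)
    s = cosine_similarity_matrix(student_feats)
    n = len(t)
    return sum((t[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def pseudo_label_cross_entropy(student_probs, llm_labels):
    """Average negative log-likelihood of the LLM-assigned labels."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps)
                for p, y in zip(student_probs, llm_labels)) / len(llm_labels)


def total_loss(student_probs, llm_labels, teacher_feats, student_feats,
               alpha=0.5):
    """Combine both terms; alpha is a hypothetical weighting, not from the paper."""
    ce = pseudo_label_cross_entropy(student_probs, llm_labels)
    sim = similarity_distillation_loss(teacher_feats, student_feats)
    return ce + alpha * sim
```

In this sketch the similarity term depends only on the direction of the feature vectors, so a student whose features are rescaled copies of the teacher's incurs no similarity penalty. The teacher and student may also use different feature dimensions, since only the batch-by-batch similarity matrices are compared.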

