CL · Nov 1, 2023

AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification

Yongxin Huang, Kexin Wang, Sourav Dutta, Raj Nath Patel, Goran Glavaš, Iryna Gurevych

arXiv:2311.00408v121.3132 citations · h-index: 22 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the inefficiency of domain adaptation for sentence encoders in few-shot classification, offering a practical solution for researchers and practitioners, though it is incremental as it builds on existing pre-training methods.

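The paper tackles domain adaptation of sentence embeddings for few-shot classification. It first shows that unsupervised domain-adaptive pre-training (DAPT) of the base language model improves few-shot accuracy by up to 8.4 points. It then proposes AdaSent, which decouples sentence embedding pre-training (SEPT) from DAPT by training a SEPT adapter once on the base model and inserting it into DAPT-ed models of any domain. AdaSent matches or surpasses full SEPT on each DAPT-ed model while substantially reducing training costs.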

Recent work has found that few-shot sentence classification based on pre-trained Sentence Encoders (SEs) is efficient, robust, and effective. In this work, we investigate strategies for domain-specialization in the context of few-shot sentence classification with SEs. We first establish that unsupervised Domain-Adaptive Pre-Training (DAPT) of a base Pre-trained Language Model (PLM) (i.e., not an SE) substantially improves the accuracy of few-shot sentence classification by up to 8.4 points. However, applying DAPT on SEs, on the one hand, disrupts the effects of their (general-domain) Sentence Embedding Pre-Training (SEPT). On the other hand, applying general-domain SEPT on top of a domain-adapted base PLM (i.e., after DAPT) is effective but inefficient, since the computationally expensive SEPT needs to be executed on top of a DAPT-ed PLM of each domain. As a solution, we propose AdaSent, which decouples SEPT from DAPT by training a SEPT adapter on the base PLM. The adapter can be inserted into DAPT-ed PLMs from any domain. We demonstrate AdaSent's effectiveness in extensive experiments on 17 different few-shot sentence classification datasets. AdaSent matches or surpasses the performance of full SEPT on DAPT-ed PLM, while substantially reducing the training costs. The code for AdaSent is available.
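The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `Backbone`, `SeptAdapter`, `embed`, and `sept_runs` are hypothetical names, the "PLM" is a single scale-and-shift layer, and the adapter is applied on top of the backbone output rather than inside transformer layers. The sketch only shows the reuse pattern: one adapter, trained once against the base model, is plugged into several domain-adapted backbones.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass(frozen=True)
class Backbone:
    """Toy stand-in for a PLM (hypothetical): one scale-and-shift layer."""
    name: str
    scale: float
    shift: float

    def encode(self, x: Vector) -> Vector:
        return [self.scale * v + self.shift for v in x]


@dataclass(frozen=True)
class SeptAdapter:
    """Toy stand-in for a SEPT adapter (hypothetical): a residual update."""
    weight: float

    def __call__(self, h: Vector) -> Vector:
        # Residual form: the backbone output plus a small learned correction.
        return [v + self.weight * v for v in h]


def embed(backbone: Backbone, adapter: SeptAdapter, x: Vector) -> Vector:
    """Sentence embedding = adapter applied to the backbone's encoding."""
    return adapter(backbone.encode(x))


def sept_runs(num_domains: int, decoupled: bool) -> int:
    """Number of expensive SEPT runs: one if decoupled, else one per domain."""
    return 1 if decoupled else num_domains


# Assumption: the adapter was trained once against the base backbone.
adapter = SeptAdapter(weight=0.5)

# Assumption: each domain backbone is the result of a separate, cheap DAPT run.
domain_backbones = [
    Backbone(name="domain_a", scale=2.0, shift=1.0),
    Backbone(name="domain_b", scale=1.0, shift=0.0),
]

# The same adapter is reused for every domain without retraining.
embeddings = {b.name: embed(b, adapter, [1.0, 2.0]) for b in domain_backbones}
```

With two domains, `sept_runs(2, decoupled=False)` gives 2 while `sept_runs(2, decoupled=True)` gives 1; the gap grows linearly with the number of domains, which is the efficiency argument the abstract makes.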
