CL, Apr 10, 2020

Beyond Fine-tuning: Few-Sample Sentence Embedding Transfer

Siddhant Garg, Rohit Kumar Sharma, Yingyu Liang

arXiv:2004.05119v231.0989 citations

Originality: Incremental advance

AI Analysis

The paper addresses a limitation of fine-tuning on few-sample NLP tasks and offers a more efficient alternative.

The paper aims to improve sentence embedding performance on few-sample tasks. It concatenates pre-trained embeddings with those from a simple model trained on the target data and reports that, on seven NLP datasets, the end-to-end variant outperforms fine-tuning with negligible computational overhead.

Fine-tuning (FT) pre-trained sentence embedding models on small datasets has been shown to have limitations. In this paper we show that concatenating the embeddings from the pre-trained model with those from a simple sentence embedding model trained only on the target data can improve over the performance of FT for few-sample tasks. To this end, a linear classifier is trained on the combined embeddings, either by freezing the embedding model weights or by training the classifier and embedding models end-to-end. We evaluate on seven small datasets from NLP tasks and show that our approach with end-to-end training outperforms FT with negligible computational overhead. Further, we show that sophisticated combination techniques like CCA and KCCA do not work as well in practice as concatenation. We provide theoretical analysis to explain this empirical observation.
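The frozen-weights variant described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: `pretrained_embed` is a hashed bag-of-words stand-in for a real pre-trained encoder, `TargetBagOfWords` stands in for the simple model trained only on the target data, and the linear classifier is a plain logistic regression. The four sentences are a made-up toy dataset.

```python
import math
import zlib


def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def pretrained_embed(sentence, dim=8):
    """Hypothetical stand-in for a frozen pre-trained sentence encoder."""
    vec = [0.0] * dim
    for tok in sentence.lower().split():
        # crc32 gives a deterministic bucket, unlike the built-in hash().
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return l2_normalize(vec)


class TargetBagOfWords:
    """Stand-in for a simple embedding model fit only on the target data."""

    def __init__(self, sentences):
        vocab = sorted({t for s in sentences for t in s.lower().split()})
        self.index = {tok: i for i, tok in enumerate(vocab)}

    def embed(self, sentence):
        vec = [0.0] * len(self.index)
        for tok in sentence.lower().split():
            if tok in self.index:
                vec[self.index[tok]] += 1.0
        return l2_normalize(vec)


def combined_embed(sentence, target_model):
    # Concatenation: pre-trained features followed by target-data features.
    return pretrained_embed(sentence) + target_model.embed(sentence)


def train_linear_classifier(features, labels, lr=1.0, epochs=1000):
    """Logistic regression trained with full-batch gradient descent."""
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(x, weights, bias):
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0


# Toy target dataset (invented for illustration): 1 = positive, 0 = negative.
train_sentences = ["good movie", "great film", "bad movie", "awful film"]
train_labels = [1, 1, 0, 0]

target_model = TargetBagOfWords(train_sentences)
train_features = [combined_embed(s, target_model) for s in train_sentences]
weights, bias = train_linear_classifier(train_features, train_labels)
```

Both embedding functions stay fixed while only the classifier weights are updated, which corresponds to the frozen setting. The end-to-end setting reported as stronger in the abstract would also update the target-data model during training, which this sketch does not attempt.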
