CL · AI · Jul 27, 2023

Exploiting the Potential of Seq2Seq Models as Robust Few-Shot Learners

arXiv:2307.14856v2 · h-index: 17
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making seq2seq models robust few-shot learners across diverse applications. It is an incremental advance that extends their in-context learning beyond tasks such as summarization and translation.

The paper tackles in-context few-shot learning with encoder-decoder (seq2seq) models on a broad range of tasks, where these models have previously lagged behind decoder-only models. It reports that the proposed methods outperform a decoder-only model six times larger and improve significantly over conventional seq2seq models.

In-context learning, which offers substantial advantages over fine-tuning, is predominantly observed in decoder-only models, while encoder-decoder (i.e., seq2seq) models excel in methods that rely on weight updates. Recently, a few studies have demonstrated the feasibility of few-shot learning with seq2seq models; however, this has been limited to tasks that align well with the seq2seq architecture, such as summarization and translation. Inspired by these initial studies, we provide a first-ever extensive experiment comparing the in-context few-shot learning capabilities of decoder-only and encoder-decoder models on a broad range of tasks. Furthermore, we propose two methods to more effectively elicit in-context learning ability in seq2seq models: objective-aligned prompting and a fusion-based approach. Remarkably, our approach outperforms a decoder-only model that is six times larger and exhibits significant performance improvements compared to conventional seq2seq models across a variety of settings. We posit that, with the right configuration and prompt design, seq2seq models can be highly effective few-shot learners for a wide spectrum of applications.
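As a rough illustration of what in-context few-shot prompting involves, the sketch below concatenates a handful of demonstrations with an unanswered query, so the model sees the task only through its input and no weights are updated. The abstract does not specify the prompt format, so everything here is an assumption: the function name `build_few_shot_prompt`, the `Input:`/`Output:` template, and the optional sentinel. The sentinel variant is one guess at what an "objective-aligned" prompt might look like for a model pretrained with span corruption (T5-style vocabularies include tokens of the form `<extra_id_0>`); it is not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical sentinel: T5-style vocabularies use tokens of this form
# to mark masked spans during pretraining.
SENTINEL = "<extra_id_0>"


def build_few_shot_prompt(
    demos: List[Tuple[str, str]], query: str, sentinel: str = ""
) -> str:
    """Concatenate (input, output) demonstrations, then the unanswered query."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    # The query's answer slot is left empty, or marked with a sentinel
    # if one is supplied (an assumed form of objective alignment).
    blocks.append(f"Input: {query}\nOutput: {sentinel}".rstrip())
    return "\n\n".join(blocks)


demos = [("great movie", "positive"), ("dull plot", "negative")]
plain = build_few_shot_prompt(demos, "loved it")
aligned = build_few_shot_prompt(demos, "loved it", sentinel=SENTINEL)
```

Either string would be passed to the model as a single input. The only difference is whether the answer position is left open or marked with a token the model already saw during pretraining.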
