CL | LG | May 30, 2022

Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models

Princeton
arXiv:2205.15223v3 | 295 citations | h-index: 55
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enabling efficient few-shot learning for discriminative models. Its method improves performance without adding parameters, which makes it an incremental but practical advance for NLP applications.

The paper adapts prompt-based few-shot learning to discriminative pre-trained models like ELECTRA, which do not fit the text-infilling paradigm used by masked language models, and shows that ELECTRA outperforms masked language models across a wide range of tasks.

Pre-trained masked language models successfully perform few-shot learning by formulating downstream tasks as text infilling. However, as a strong alternative in full-shot settings, discriminative pre-trained models like ELECTRA do not fit into the paradigm. In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models in a wide range of tasks. ELECTRA is pre-trained to distinguish if a token is generated or original. We naturally extend that to prompt-based few-shot learning by training to score the originality of the target options without introducing new parameters. Our method can be easily adapted to tasks involving multi-token predictions without extra computation overhead. Analysis shows that ELECTRA learns distributions that align better with downstream tasks.
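
The scoring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `toy_discriminator`, the `[OPTION]` placeholder, the template, and the choice to average probabilities over multi-token options are all assumptions made for illustration. The sketch also assumes that a positive logit means "replaced", so the originality score is one minus the sigmoid of the logit. Each candidate option is filled into the prompt, the discriminator scores every token in one pass, and the option whose tokens look most "original" is predicted.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: maps a token sequence to one "replaced" logit per
# token, mimicking the shape of an ELECTRA-style discriminator head output.
Discriminator = Callable[[Sequence[str]], List[float]]

SLOT = "[OPTION]"  # placeholder name chosen for this sketch


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score_option(disc: Discriminator, template: List[str],
                 option: Sequence[str]) -> float:
    """Fill `option` into the template and return its mean P(original)."""
    slot = template.index(SLOT)
    tokens = template[:slot] + list(option) + template[slot + 1:]
    logits = disc(tokens)  # one pass, regardless of option length
    span = logits[slot:slot + len(option)]
    # Assumption: logit > 0 means "replaced", so P(original) = 1 - sigmoid.
    probs = [1.0 - sigmoid(logit) for logit in span]
    # Assumption: multi-token options are aggregated by a simple average.
    return sum(probs) / len(probs)


def predict(disc: Discriminator, template: List[str],
            options: Dict[str, Sequence[str]]) -> Tuple[str, Dict[str, float]]:
    """Return the label whose option tokens score as most original."""
    scores = {label: score_option(disc, template, toks)
              for label, toks in options.items()}
    return max(scores, key=scores.get), scores


def toy_discriminator(tokens: Sequence[str]) -> List[float]:
    """Stand-in for a real model: flags sentiment words that clash with context."""
    clash = {"terrible": "loved", "great": "hated"}
    return [2.0 if clash.get(tok) in tokens else -2.0 for tok in tokens]


TEMPLATE = ["i", "loved", "this", "movie", ".", "it", "was", SLOT, "."]
OPTIONS = {"positive": ["great"], "negative": ["terrible"]}

label, scores = predict(toy_discriminator, TEMPLATE, OPTIONS)
print(label, scores)
```

Because the options are scored with the discriminator head that already exists from pre-training, the sketch needs no new parameters, and a multi-token option is scored by reading several positions of the same output.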

Code Implementations: 1 repo
