CL, LG | Jun 7, 2022

DynaMaR: Dynamic Prompt with Mask Token Representation

arXiv:2206.02982v1284 citations | h-index: 30
Originality: Incremental advance
AI Analysis

This incremental improvement addresses prompt engineering challenges for practitioners adapting language models to downstream tasks like e-commerce classification.

The paper tackles two problems in prompt-based fine-tuning of language models: overfitting to the prompt template, and the manual effort needed to recast a downstream task as a language-model problem. It proposes DynaMaR, which reports an average improvement of 10% in few-shot settings and 3.7% in data-rich settings over standard fine-tuning on four e-commerce applications.

Recent research has shown that large language models pretrained using unsupervised approaches can achieve significant performance improvement on many downstream tasks. Typically when adapting these language models to downstream tasks, like a classification or regression task, we employ a fine-tuning paradigm in which the sentence representation from the language model is input to a task-specific head; the model is then fine-tuned end-to-end. However, with the emergence of models like GPT-3, prompt-based fine-tuning has been proven to be a successful approach for few-shot tasks. Inspired by this work, we study discrete prompt technologies in practice. There are two issues that arise with the standard prompt approach. First, it can overfit on the prompt template. Second, it requires manual effort to formulate the downstream task as a language model problem. In this paper, we propose an improvement to prompt-based fine-tuning that addresses these two issues. We refer to our approach as DynaMaR -- Dynamic Prompt with Mask Token Representation. Results show that DynaMaR can achieve an average improvement of 10% in few-shot settings and improvement of 3.7% in data-rich settings over the standard fine-tuning approach on four e-commerce applications.
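To make the "standard prompt approach" in the abstract concrete, here is a minimal sketch of how a classification task is typically recast as a masked-language-model problem with a discrete prompt. It is not DynaMaR itself, whose mechanism the abstract does not describe. The template, the label words (the "verbalizer"), the example review, and the toy logits are all assumptions made for illustration; in practice the logits would come from a pretrained model's prediction at the mask position.

```python
import math

MASK = "[MASK]"  # assumed mask token; the actual token depends on the model

def build_prompt(text, template):
    """Fill a hand-written template with the input text.

    The template must contain exactly one mask slot, which the
    language model is asked to fill in.
    """
    if template.count(MASK) != 1:
        raise ValueError("template must contain exactly one mask token")
    return template.format(text=text)

def predict_label(mask_logits, verbalizer):
    """Pick a label by comparing the mask-position scores of the label words.

    mask_logits: dict mapping vocabulary words to scores at the mask position.
    verbalizer:  dict mapping each task label to one vocabulary word.
    Returns the best label and a softmax restricted to the label words.
    """
    labels = list(verbalizer)
    scores = [mask_logits[verbalizer[label]] for label in labels]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Hypothetical template and verbalizer: both are written by hand, which is
# the manual effort the abstract refers to.
template = "{text} It was " + MASK + "."
verbalizer = {"positive": "great", "negative": "terrible"}

prompt = build_prompt("The headphones arrived quickly and sound fantastic.", template)

# Toy scores standing in for a real model's output at the mask position.
toy_logits = {"great": 2.1, "terrible": -0.3, "okay": 0.5}
label, probs = predict_label(toy_logits, verbalizer)
print(prompt)
print(label, probs)
```

Because the prediction depends on one fixed template and one fixed set of label words, a model fine-tuned this way can latch onto that particular wording, which is the template-overfitting issue the abstract raises.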
