CL | Jan 31, 2023

Differentiable Entailment for Parameter Efficient Few Shot Learning

arXiv:2301.13345v1 | h-index: 3
AI Analysis

This work addresses the practical deployment of few-shot learning models by reducing their computational cost, though the contribution is incremental because it combines existing techniques.

The paper tackles parameter inefficiency in few-shot learning by combining intermediate training on tasks reformulated as entailment with differentiable optimization of template and label tokens, achieving competitive performance while optimizing only 3% of model parameters.

Few-shot learning allows pre-trained language models to adapt to downstream tasks using a limited number of training examples. However, practical applications are limited when all model parameters must be optimized. In this work we apply a new technique for parameter-efficient few-shot learning while adopting a strict definition of parameter efficiency. Our training method combines 1) intermediate training by reformulating natural language tasks as entailment tasks (Wang et al., 2021) and 2) differentiable optimization of template and label tokens (Zhang et al., 2021). We quantify the tradeoff between parameter efficiency and performance in the few-shot regime and propose a simple, model-agnostic approach that can be extended to any task. By achieving competitive performance while optimizing only 3% of a model's parameters and allowing for batched inference, we enable more efficient practical deployment of models.
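To make the first ingredient concrete, the sketch below shows one way a classification example could be recast as entailment pairs: the input text becomes the premise, and each candidate label is turned into a natural-language hypothesis. This is only an illustration under stated assumptions. The label descriptions, the `EntailmentPair` type, and the function names are hypothetical and are not taken from the paper, and `score_fn` stands in for whatever entailment model would be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical label descriptions for a sentiment task (an assumption,
# not the templates used in the paper).
LABEL_DESCRIPTIONS: Dict[str, str] = {
    "positive": "This review is positive.",
    "negative": "This review is negative.",
}


@dataclass(frozen=True)
class EntailmentPair:
    premise: str      # the original input text
    hypothesis: str   # a sentence describing one candidate label
    entailed: bool    # True only for the gold label's hypothesis


def to_entailment_pairs(
    text: str,
    gold_label: str,
    descriptions: Dict[str, str] = LABEL_DESCRIPTIONS,
) -> List[EntailmentPair]:
    """Turn one labeled example into one entailment pair per candidate label."""
    if gold_label not in descriptions:
        raise ValueError(f"unknown label: {gold_label!r}")
    return [
        EntailmentPair(text, hypothesis, label == gold_label)
        for label, hypothesis in descriptions.items()
    ]


def predict_label(
    text: str,
    score_fn: Callable[[str, str], float],
    descriptions: Dict[str, str] = LABEL_DESCRIPTIONS,
) -> str:
    """Return the label whose hypothesis receives the highest entailment score."""
    return max(descriptions, key=lambda label: score_fn(text, descriptions[label]))
```

At training time, each pair would be fed to a binary entailment classifier. At inference time, `predict_label` scores one premise against every label hypothesis and picks the best one, so all hypotheses for an input could in principle be scored together in a single batch.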

