CL · Jan 31, 2023

Differentiable Entailment for Parameter Efficient Few Shot Learning

arXiv:2301.13345v1 · h-index: 3

AI Analysis

This work addresses the challenge of practical deployment for few-shot learning models by reducing computational costs, though it is incremental as it builds on existing techniques.

The paper tackles parameter inefficiency in few-shot learning by combining intermediate training on tasks reformulated as entailment with differentiable optimization of template and label tokens, achieving competitive performance while optimizing only 3% of model parameters.

Few-shot learning allows pre-trained language models to adapt to downstream tasks while using a limited number of training examples. However, practical applications are limited when all model parameters must be optimized. In this work we apply a new technique for parameter-efficient few-shot learning while adopting a strict definition of parameter efficiency. Our training method combines 1) intermediate training by reformulating natural language tasks as entailment tasks (Wang et al., 2021) and 2) differentiable optimization of template and label tokens (Zhang et al., 2021). We quantify the tradeoff between parameter efficiency and performance in the few-shot regime and propose a simple model-agnostic approach that can be extended to any task. By achieving competitive performance while only optimizing 3% of a model's parameters and allowing for batched inference, we allow for more efficient practical deployment of models.
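The entailment reformulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a sentiment task with two labels; the label descriptions, the function names, and the word-overlap scorer are all hypothetical stand-ins chosen for illustration, not the paper's templates or model. In practice the scorer would be a pre-trained entailment model, and the per-label pairs could be scored together in one batch.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical label descriptions (assumption: not taken from the paper).
LABEL_DESCRIPTIONS: Dict[str, str] = {
    "positive": "It was great.",
    "negative": "It was terrible.",
}


def to_entailment_pairs(
    text: str, label_descriptions: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Turn one classification input into (label, premise, hypothesis) triples,
    one per candidate label."""
    return [(label, text, hyp) for label, hyp in label_descriptions.items()]


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Stand-in scorer (assumption): counts shared words.
    A real system would call an entailment model here."""
    def words(s: str) -> set:
        return {w.strip(".,!?").lower() for w in s.split()}
    return float(len(words(premise) & words(hypothesis)))


def predict(
    text: str,
    label_descriptions: Dict[str, str],
    entail_score: Callable[[str, str], float],
) -> str:
    """Return the label whose hypothesis receives the highest entailment score."""
    pairs = to_entailment_pairs(text, label_descriptions)
    scores = {label: entail_score(prem, hyp) for label, prem, hyp in pairs}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(predict("The movie was great", LABEL_DESCRIPTIONS, toy_entail_score))
```

Because every task is expressed as the same premise/hypothesis decision, the scoring function stays fixed across tasks and only the label descriptions change.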
