CL, LG | Jan 2, 2021

Learning to Generate Task-Specific Adapters from Task Description

arXiv:2101.00420v2729 citations
AI Analysis

This work addresses the problem of improving generalization to unseen NLP tasks for text-to-text transformers, which is relevant for researchers and practitioners working with these models.

This paper introduces Hypter, a framework that generates task-specific adapters from task descriptions to improve the generalization of text-to-text transformers to unseen tasks. Hypter achieved an 11.3% comparative improvement on the ZEST dataset when using BART-Large.

Pre-trained text-to-text transformers such as BART have achieved impressive performance across a range of NLP tasks. A recent study further shows that they can learn to generalize to novel tasks when task descriptions are included as part of the source sequence and the model is trained with (source, target) examples. At test time, these fine-tuned models can make inferences on new tasks using the new task descriptions as part of the input. However, this approach has potential limitations, as the model learns to solve individual (source, target) examples (i.e., at the instance level), instead of learning to solve tasks by taking all examples within a task as a whole (i.e., at the task level). To this end, we introduce Hypter, a framework that improves text-to-text transformers' generalization ability to unseen tasks by training a hypernetwork to generate task-specific, light-weight adapters from task descriptions. Experiments on the ZEST dataset and a synthetic SQuAD dataset demonstrate that Hypter improves upon fine-tuning baselines. Notably, when using BART-Large as the main network, Hypter brings an 11.3% comparative improvement on the ZEST dataset.
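The core idea in the abstract, a hypernetwork that turns a task description into adapter weights, can be illustrated with a toy sketch. This is not the authors' implementation: the single linear hypernetwork, the bottleneck adapter with a ReLU and a residual connection, all dimensions, and the hand-written vectors standing in for encoded task descriptions are assumptions made for illustration only. Names such as `HyperNetwork` and `apply_adapter` are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for trained hypernetwork parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class HyperNetwork:
    """Hypothetical hypernetwork: maps a task-description embedding to the
    weights of one bottleneck adapter (assumed here to be a linear map)."""

    def __init__(self, desc_dim, hidden_dim, bottleneck_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.bottleneck_dim = bottleneck_dim
        n_params = 2 * hidden_dim * bottleneck_dim  # down- plus up-projection
        self.proj = random_matrix(n_params, desc_dim, rng)

    def generate(self, desc_embedding):
        """Return (w_down, w_up) generated from the description embedding."""
        flat = matvec(self.proj, desc_embedding)
        h, b = self.hidden_dim, self.bottleneck_dim
        split = b * h
        # w_down has shape (b, h); w_up has shape (h, b).
        w_down = [flat[i * h:(i + 1) * h] for i in range(b)]
        w_up = [flat[split + i * b:split + (i + 1) * b] for i in range(h)]
        return w_down, w_up


def apply_adapter(hidden, w_down, w_up):
    """Bottleneck adapter with a residual connection:
    hidden + w_up @ relu(w_down @ hidden)."""
    bottleneck = [max(0.0, v) for v in matvec(w_down, hidden)]
    delta = matvec(w_up, bottleneck)
    return [x + d for x, d in zip(hidden, delta)]


# Two made-up description embeddings; in practice these would come from
# encoding the task description text.
hyper = HyperNetwork(desc_dim=4, hidden_dim=6, bottleneck_dim=2)
task_a = [1.0, 0.0, 0.5, -0.5]
task_b = [0.0, 1.0, -0.5, 0.5]

# One hidden state from the (conceptually frozen) main network.
hidden = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]

adapter_a = hyper.generate(task_a)
adapter_b = hyper.generate(task_b)
out_a = apply_adapter(hidden, *adapter_a)
out_b = apply_adapter(hidden, *adapter_b)
```

In this sketch the adapter weights depend only on the task description, so every example of a task shares the same adapter. That is one way to read the abstract's contrast between task-level and instance-level learning.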

Code Implementations: 1 repo
