Learning from Task Descriptions
This addresses the challenge for NLP researchers to develop models that generalize from task descriptions without extensive training data, though it is incremental as it synthesizes prior work into a new framework and dataset.
The paper tackles the problem of enabling NLP systems to solve new tasks from descriptions alone, like humans, by introducing the ZEST dataset for task-oriented evaluation on unseen tasks. The result shows that the state-of-the-art T5 model only achieves a 12% score on ZEST, highlighting a significant performance gap.
Typically, machine learning systems solve new tasks by training on thousands of examples. In contrast, humans can solve new tasks by reading some instructions, with perhaps an example or two. To take a step toward closing this gap, we introduce a framework for developing NLP systems that solve new tasks after reading their descriptions, synthesizing prior work in this area. We instantiate this framework with a new English language dataset, ZEST, structured for task-oriented evaluation on unseen tasks. Formulating task descriptions as questions, we ensure each is general enough to apply to many possible inputs, thus comprehensively evaluating a model's ability to solve each task. Moreover, the dataset's structure tests specific types of systematic generalization. We find that the state-of-the-art T5 model achieves a score of 12% on ZEST, leaving a significant challenge for NLP researchers.