CL AI · Apr 19, 2022

What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment

Matthew Finlayson, Kyle Richardson, Ashish Sabharwal, Peter Clark

AI2

arXiv:2204.09148v224.3296 citations · h-index: 64 · Has Code

Originality: Incremental advance

AI Analysis

This work aims to understand and improve the instruction learning capabilities of large transformer models. It is an incremental advance, as it builds on existing synthetic environments.

The paper investigates why instruction learning is difficult, using a synthetic environment in which a model must decide whether a string matches a regular expression. It finds that models struggle with large regular languages and with executions that require tracking long contexts. The authors construct a challenging dataset, Hard RegSet, on which a fine-tuned T5 model correctly interprets only 65.6% of test instructions (with at least 90% accuracy), and 11%-24% of instructions in out-of-distribution settings.

The instruction learning paradigm -- where a model learns to perform new tasks from task descriptions alone -- has become popular in general-purpose model research. The capabilities of large transformer models as instruction learners, however, remain poorly understood. We use a controlled synthetic environment to characterize such capabilities. Specifically, we use the task of deciding whether a given string matches a regular expression (viewed as an instruction) to identify properties of tasks, instructions, and instances that make instruction learning challenging. For instance, we find that our model, a fine-tuned T5-based text2text transformer, struggles with large regular languages, suggesting that less precise instructions are challenging for models. Additionally, instruction executions that require tracking longer contexts of prior steps are also more difficult. We use our findings to systematically construct a challenging instruction learning dataset, which we call Hard RegSet. Fine-tuning on Hard RegSet, our large transformer learns to correctly interpret only 65.6% of test instructions (with at least 90% accuracy), and 11%-24% of the instructions in out-of-distribution generalization settings. We propose Hard RegSet as a challenging instruction learning task, and a controlled environment for studying instruction learning.
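The task and the headline metric described above can be illustrated with a minimal sketch. It assumes Python's `re` syntax and whole-string matching, which may differ from the regular expression dialect used in the paper; the names `matches` and `fraction_interpreted`, and the example data, are hypothetical and are not taken from the authors' code.

```python
import re


def matches(regex: str, string: str) -> bool:
    """Return True if the whole string is in the language of the regex.

    Assumption: Python `re` syntax and full-string matching.
    """
    return re.fullmatch(regex, string) is not None


def fraction_interpreted(predictions, threshold=0.9):
    """Fraction of instructions (regexes) answered with accuracy >= threshold.

    `predictions` maps each regex to a list of (string, predicted_label) pairs.
    This mirrors the "correctly interpreted with at least 90% accuracy" idea.
    """
    solved = 0
    for regex, instances in predictions.items():
        correct = sum(matches(regex, s) == label for s, label in instances)
        if correct / len(instances) >= threshold:
            solved += 1
    return solved / len(predictions)


# Hypothetical model predictions for two instructions.
example_predictions = {
    "(ab)*": [("abab", True), ("aba", False), ("", True)],  # all correct
    "a*b": [("aab", True), ("ba", True)],  # second prediction is wrong
}
score = fraction_interpreted(example_predictions)
```

Here the first instruction counts as correctly interpreted and the second does not, so `score` is 0.5. Note that accuracy is aggregated per instruction rather than per instance, which is why a model can label most strings correctly and still interpret few instructions.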
