CL · AI · LG · Apr 17, 2023

LongForm: Effective Instruction Tuning with Reverse Instructions

Meta AI
arXiv:2304.08460v3 · 54 citations · h-index: 70 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the high cost of instruction data for AI researchers and practitioners by offering a cheaper and cleaner dataset for instruction tuning, though it is an incremental improvement over existing methods.

The authors tackle the costly and challenging acquisition of instruction data for language models by introducing the LongForm-C dataset, created via reverse instructions. Models trained on it outperform 10x larger language models without instruction tuning, and they outperform prior instruction-tuned models such as FLAN-T5 and Alpaca by a large margin.

Instruction tuning enables language models to more effectively generalize and better follow user intent. However, obtaining instruction data is costly and challenging. Prior work employs methods such as expensive human annotation, crowd-sourced datasets with alignment issues, and generating noisy examples via LLMs. We introduce the LongForm-C dataset, which is created by reverse instructions. We generate instructions via LLMs for human-written corpus examples using reverse instructions. First we select a diverse set of human-written documents from corpora such as C4 and Wikipedia; then we generate instructions for these documents via LLMs. This approach provides a cheaper and cleaner instruction-tuning dataset with natural output and one suitable for long text generation. Our models outperform 10x larger language models without instruction tuning on tasks such as story/recipe generation and long-form question answering. Moreover, LongForm models outperform prior instruction-tuned models such as FLAN-T5 and Alpaca by a large margin, and improve language understanding capabilities further. We publicly release our data and models: https://github.com/akoksal/LongForm.
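The reverse-instructions idea described in the abstract can be sketched roughly as follows: take a human-written document, ask an LLM what instruction it could be the answer to, and store the pair with the document as the target output. This is a minimal illustration under stated assumptions, not the authors' implementation. The prompt wording, the `generate` callable, the `min_chars` filter, and all names below are hypothetical; the released pipeline is in the linked repository.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class InstructionExample:
    instruction: str  # produced by the LLM
    output: str       # the original human-written document
    source: str       # corpus name, e.g. "c4" or "wikipedia"


# Hypothetical prompt wording; the paper's actual prompts may differ.
PROMPT_TEMPLATE = (
    "Instruction: X\n"
    "Output: {document}\n"
    "What kind of instruction could X be for this output?\n"
    "X:"
)


def build_reverse_instruction_dataset(
    documents: Iterable[Tuple[str, str]],
    generate: Callable[[str], str],
    min_chars: int = 0,
) -> List[InstructionExample]:
    """Pair each (source, text) document with an LLM-generated instruction.

    `generate` is an assumed stand-in for any text-completion call that
    maps a prompt string to a completion string.
    """
    examples = []
    for source, text in documents:
        text = text.strip()
        if len(text) < min_chars:
            continue  # skip documents too short to be useful targets
        instruction = generate(PROMPT_TEMPLATE.format(document=text)).strip()
        if not instruction:
            continue  # drop documents the LLM produced no instruction for
        examples.append(InstructionExample(instruction, text, source))
    return examples


def stub_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a fixed instruction.
    return "Write a short encyclopedia entry about the topic."


if __name__ == "__main__":
    docs = [("wikipedia", "Paris is the capital and largest city of France.")]
    for example in build_reverse_instruction_dataset(docs, stub_llm):
        print(example.instruction, "->", example.output)
```

Only the instruction is machine-generated here; the output side stays as the original human-written text, which is what the abstract means by "natural output".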

Code Implementations: 2 repos