CL | AI | Oct 29, 2023

TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise

arXiv:2310.19019v3 | 14 citations | h-index: 8 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses data annotation and learning efficiency for small NLP models. The advance is incremental because it builds on existing LLM capabilities.

The paper tackles the problem of enabling small language models to learn effectively by proposing TeacherLM-7.1B, which annotates NLP samples with fundamentals, chain of thought, and common mistakes, achieving a zero-shot MMLU score of 52.3 and augmenting 58 datasets to improve student models.

Large Language Models (LLMs) exhibit impressive reasoning and data augmentation capabilities in various NLP tasks. However, what about small models? In this work, we propose TeacherLM-7.1B, capable of annotating relevant fundamentals, chain of thought, and common mistakes for most NLP samples, which makes annotation more than just an answer, thus allowing other models to learn "why" instead of just "what". The TeacherLM-7.1B model achieved a zero-shot score of 52.3 on MMLU, surpassing most models with over 100B parameters. Even more remarkable is its data augmentation ability. Based on TeacherLM-7.1B, we augmented 58 NLP datasets and taught various student models with different parameters from OPT and BLOOM series in a multi-task setting. The experimental results indicate that the data augmentation provided by TeacherLM has brought significant benefits. We will release the TeacherLM series of models and augmented datasets as open-source.
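The annotation idea in the abstract can be illustrated with a small sketch. This is not the authors' pipeline or prompt format: the `AnnotatedSample` class, the `to_training_text` function, and the field labels are hypothetical names chosen for illustration. The sketch only assumes that each sample carries the three annotation types named in the abstract (fundamentals, chain of thought, common mistakes) and that they are placed before the answer when building a student model's training text.

```python
from dataclasses import dataclass

# Hypothetical container: the three annotation fields mirror the abstract,
# but the exact schema used by TeacherLM is not given in this page.
@dataclass
class AnnotatedSample:
    question: str
    answer: str
    fundamentals: str
    chain_of_thought: str
    common_mistakes: str

# Illustrative labels only; the real template is an assumption.
LABELS = {
    "fundamentals": "Fundamentals",
    "chain_of_thought": "Chain of thought",
    "common_mistakes": "Common mistakes",
}

def to_training_text(sample, fields=tuple(LABELS)):
    """Join the question, the selected annotations, and the answer into one string."""
    parts = [f"Question: {sample.question}"]
    for name in fields:
        parts.append(f"{LABELS[name]}: {getattr(sample, name)}")
    parts.append(f"Answer: {sample.answer}")
    return "\n".join(parts)

if __name__ == "__main__":
    demo = AnnotatedSample(
        question="What is 2 + 2?",
        answer="4",
        fundamentals="Addition combines two quantities into a total.",
        chain_of_thought="Start at 2 and count up two more: 3, 4.",
        common_mistakes="Confusing addition with multiplication.",
    )
    print(to_training_text(demo))
```

Passing a subset such as `fields=("chain_of_thought",)` gives a variant with only one annotation type, which is one way to compare the contribution of each field under these assumptions.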

