CL AI LG

Jun 20, 2023

Textbooks Are All You Need

Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah

Microsoft

arXiv:2306.11644v2

35.2610 citations

h-index: 50

Originality: Incremental advance

AI Analysis

This work addresses the need for more computationally efficient code generation models, though its contribution is incremental: it scales down existing methods rather than introducing new ones.

The authors address the cost of large language models for code by introducing phi-1, a 1.3B-parameter model trained on high-quality data that achieves pass@1 accuracies of 50.6% on HumanEval and 55.5% on MBPP.

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
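For readers unfamiliar with the pass@1 metric quoted above, the following is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. The abstract does not say how its scores were computed, so this is an illustration of the metric in general and not the authors' evaluation code. The function names `pass_at_k` and `benchmark_pass_at_k`, and the `(n, c)` input layout, are assumptions made for this example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all unit tests
    k: sample budget being scored (k=1 gives pass@1)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains no passing sample
    # is C(n - c, k) / C(n, k); math.comb returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average per-problem pass@k over (n, c) pairs (hypothetical layout)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n for each problem, so a benchmark-level pass@1 is the mean fraction of generated samples that pass the tests.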

