CL · Feb 27, 2023

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin

arXiv:2302.13971v1 · 70.7 · 20966 citations · h-index: 71 · Has Code

Originality: Highly original

AI Analysis

This work provides open, state-of-the-art language models to the research community, addressing the problem of reliance on proprietary datasets in AI development.

The authors introduce LLaMA, a collection of open and efficient foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens using only publicly available datasets. LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B.

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
