CL
Feb 27, 2023

LLaMA: Open and Efficient Foundation Language Models

arXiv:2302.13971v1
20224 citations
h-index: 71
Originality: Highly original
AI Analysis

This work provides open, state-of-the-art language models to the research community, addressing the problem of reliance on proprietary datasets in AI development.

The authors introduce LLaMA, a collection of open and efficient foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens using only publicly available datasets. LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B.

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Code Implementations: 57 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
