LLaMA: Open and Efficient Foundation Language Models
This work provides open, state-of-the-art language models to the research community, addressing the reliance on proprietary and inaccessible datasets when training large language models.
The authors introduce LLaMA, a collection of open and efficient foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens using only publicly available datasets. LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B.
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.