CL · AI · Jan 4, 2024

TinyLlama: An Open-Source Small Language Model

arXiv:2401.02385v2 · 746 citations · h-index: 8 · Has Code
Originality: Incremental advance
AI Analysis

TinyLlama is an efficient, high-performing small language model suited to resource-constrained applications. The advance is incremental: it builds on the Llama 2 architecture and tokenizer and on open-source community tooling such as FlashAttention and Lit-GPT.

The authors address the problem of building a compact language model with TinyLlama, a 1.1B-parameter model pretrained on around 1 trillion tokens for approximately 3 epochs. It significantly outperforms existing open-source models of similar size on downstream tasks.

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention and Lit-GPT), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.
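To make the scale in the abstract concrete, the figures it quotes can be turned into rough numbers. This is a minimal back-of-the-envelope sketch, not anything taken from the authors' code. The function names are hypothetical, the inputs are the abstract's approximate values, and the bytes-per-parameter table is an assumption about common storage formats that covers weights only (no activations, KV cache, or optimizer state).

```python
# Rough arithmetic based on the abstract's approximate figures.
PARAMS = 1.1e9         # "compact 1.1B language model"
UNIQUE_TOKENS = 1e12   # "around 1 trillion tokens"
EPOCHS = 3             # "approximately 3 epochs"

# Assumption: typical bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def tokens_seen(unique_tokens: float, epochs: float) -> float:
    """Total tokens processed when the corpus is repeated `epochs` times."""
    return unique_tokens * epochs


def tokens_per_parameter(total_tokens: float, params: float) -> float:
    """Training tokens processed per model parameter."""
    return total_tokens / params


def weight_memory_gb(params: float, dtype: str) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


total = tokens_seen(UNIQUE_TOKENS, EPOCHS)
print(f"tokens processed: {total:.2e}")
print(f"tokens per parameter: {tokens_per_parameter(total, PARAMS):.0f}")
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(PARAMS, dtype):.2f} GB of weights")
```

Under these assumptions the model processes roughly 3 trillion tokens in total, or a few thousand tokens per parameter, and its weights occupy about 2.2 GB at 16-bit precision. That footprint is what makes the model plausible for resource-constrained settings.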

Code Implementations: 2 repos
