CL · AI · Jun 5, 2024

Xmodel-LM Technical Report

arXiv:2406.02856v5 · 1 citation · Has Code
Originality: Synthesis-oriented
AI Analysis

Xmodel-LM offers an efficient language model for users who need strong performance with limited resources, though the work appears incremental, since it builds on existing scaling approaches.

The paper tackles the challenge of building a compact language model by introducing Xmodel-LM, a 1.1B-parameter model pre-trained on around 2 trillion tokens, which the authors report outperforms open-source models of similar scale.

We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.

Code Implementations · 3 repos
