CL AI | Jun 5, 2024

Xmodel-LM Technical Report

Yichuan Wang, Yang Liu, Yu Yan, Qun Wang, Xucheng Huang, Ling Jiang

arXiv:2406.02856v51.91 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

Xmodel-LM offers an efficient language model for users who need strong performance with limited resources, though the work appears incremental because it builds on existing scaling approaches.

The paper tackles the challenge of creating a compact language model by introducing Xmodel-LM, a 1.1B parameter model pre-trained on 2 trillion tokens, which outperforms similar-scale open-source models.

We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
