CL, ML | Feb 27, 2024

Stable LM 2 1.6B Technical Report

Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta, Meng Lee, Emad Mostaque

arXiv:2402.17834v122.382 citations | h-index: 10 | Has Code

Originality: Synthesis-oriented

AI Analysis

StableLM 2 1.6B is an incremental improvement in small-scale open language models, of interest to researchers and developers.

The authors introduce StableLM 2 1.6B, the first model in a new generation of their language model series. They detail its data, training procedure, and evaluations, and report that it was the state-of-the-art open model under 2B parameters at publication time.

We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including zero- and few-shot benchmarks, multilingual benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of publishing this report, StableLM 2 1.6B was the state-of-the-art open model under 2B parameters by a significant margin. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open source several quantized checkpoints and provide their performance metrics compared to the original model.
