Stable LM 2 1.6B Technical Report
Title: | Stable LM 2 1.6B Technical Report |
---|---|
Authors: | Bellagente, Marco, Tow, Jonathan, Mahan, Dakota, Phung, Duy, Zhuravinskyi, Maksym, Adithyan, Reshinth, Baicoianu, James, Brooks, Ben, Cooper, Nathan, Datta, Ashish, Lee, Meng, Mostaque, Emad, Pieler, Michael, Pinnaparaju, Nikhil, Rocha, Paulo, Saini, Harry, Teufel, Hannah, Zanichelli, Niccolo, Riquelme, Carlos |
Publication Year: | 2024 |
Collection: | Computer Science Statistics |
Subject Terms: | Computer Science - Computation and Language, Statistics - Machine Learning |
More Details: | We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including zero- and few-shot benchmarks, multilingual benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of publishing this report, StableLM 2 1.6B was the state-of-the-art open model under 2B parameters by a significant margin. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open source several quantized checkpoints and provide their performance metrics compared to the original model. Comment: 23 pages, 6 figures |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2402.17834 |
Accession Number: | edsarx.2402.17834 |
Database: | arXiv |
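The abstract notes that the weights for both the base and instruction-tuned models are available via Hugging Face for anyone to download and use. Below is a minimal sketch of loading and sampling from the base model with the `transformers` library; the repository id `stabilityai/stablelm-2-1_6b` and the `trust_remote_code` flag are assumptions here, so check the model card linked from the report for the exact identifiers.

```python
# Minimal sketch: download StableLM 2 1.6B from the Hugging Face Hub and generate text.
# The repo id "stabilityai/stablelm-2-1_6b" is an assumption; verify it against the
# model card referenced in the report before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-1_6b"  # assumed base-model repo id

# trust_remote_code may be required if the checkpoint ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The weather today is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern, pointed at the instruction-tuned repository id, would load the chat-oriented variant described in the report; the quantized checkpoints mentioned in the abstract are distributed separately and are not covered by this sketch.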