Academic Paper
Stable LM 2 1.6B Technical Report
Document Type
Working Paper
Author
Bellagente, Marco; Tow, Jonathan; Mahan, Dakota; Phung, Duy; Zhuravinskyi, Maksym; Adithyan, Reshinth; Baicoianu, James; Brooks, Ben; Cooper, Nathan; Datta, Ashish; Lee, Meng; Mostaque, Emad; Pieler, Michael; Pinnaparaju, Nikhil; Rocha, Paulo; Saini, Harry; Teufel, Hannah; Zanichelli, Niccolo; Riquelme, Carlos
Source
Subject
Language
Abstract
We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including zero- and few-shot benchmarks, multilingual benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of publishing this report, StableLM 2 1.6B was the state-of-the-art open model under 2B parameters by a significant margin. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open source several quantized checkpoints and provide their performance metrics compared to the original model.
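As a minimal sketch of how the publicly released weights could be downloaded and run via Hugging Face with the `transformers` library (the repository id `stabilityai/stablelm-2-1_6b`, the dtype, and the prompt below are assumptions for illustration, not details taken from this record):

```python
# Minimal sketch: loading the base StableLM 2 1.6B weights from Hugging Face.
# The repository id and generation settings are assumptions, not from the abstract.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-1_6b"  # assumed Hugging Face repo id for the base model

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the ~1.6B-parameter model small in memory
    trust_remote_code=True,
)

# Simple greedy generation from a short prompt.
inputs = tokenizer("Stable LM 2 1.6B is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```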
Comment: 23 pages, 6 figures