WARNING: The checkpoints in this repo are Megatron-DeepSpeed checkpoints. Use them together with our fork of Megatron-DeepSpeed. For a normal Hugging Face Transformers checkpoint, please go here instead.
BLOOM LM
BigScience Large Open-science Open-access Multilingual Language Model
Model Card
Version 1.0 / 20.July.2022 - Model card copied from bloom-176-intermediate repo
Available intermediary checkpoints - global steps:
- 95000
To request more checkpoints, open an issue on this repo.
Table of Contents
- Model Details
- Uses
- Training Data
- Risks and Limitations
- Evaluation
- Recommendations
- Glossary and Calculations
- More Information
- Model Card Authors
Model Details
BLOOM is a type of language model, which is a probability distribution over sequences of words. Specifically, BLOOM is a Large Language Model (LLM), meaning that it is trained on vast amounts of text data using industrial-scale computational resources. As such, the model is able to capture the statistical tendencies of words, phrases, sentences, and larger spans of text that it is exposed to in the training data.
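To make "a probability distribution over sequences of words" concrete, here is a minimal sketch of a toy bigram language model in Python. It is not how BLOOM is implemented: BLOOM is a large neural network trained on far more data. The three-sentence corpus, the function names `train_bigram_model` and `sequence_probability`, and the bigram simplification itself are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Estimate P(next_word | previous_word) from raw counts (hypothetical helper)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        # Add start and end markers so sentence boundaries are modeled too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, next_counts in counts.items():
        total = sum(next_counts.values())
        model[prev] = {word: c / total for word, c in next_counts.items()}
    return model


def sequence_probability(model, sentence):
    """Multiply the conditional probabilities of each word given the previous one."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        # Transitions never seen in training get probability 0 (no smoothing).
        prob *= model.get(prev, {}).get(nxt, 0.0)
    return prob


# Toy corpus, assumed only for this example.
toy_corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram_model(toy_corpus)

p_seen = sequence_probability(model, "the cat sat")
p_unseen = sequence_probability(model, "sat the cat")
print(p_seen, p_unseen)
```

The model assigns higher probability to word orders that resemble its training text and zero probability to orders it never observed. This is the "statistical tendencies" idea in miniature; a large model generalizes far beyond exact counts.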
Basics
This section provides information about the model type, version, license, funders, release date, developers, and contact information. It is useful for anyone who wants to reference the model.
Technical Specifications
This section includes details about the model objective and architecture, and the compute infrastructure. It is useful for people interested in model development.
Training
This section provides information about the training data, the speed and size of training elements, and the environmental impact of training. It is useful for people who want to learn more about the model inputs and training footprint.
Uses
This section addresses questions around how the model is intended to be used, discusses the foreseeable users of the model (including those affected by the model), and describes uses that are considered out of scope or misuse of the model. It is useful for anyone considering using the model or who is affected by the model.
Risks and Limitations
This section identifies foreseeable harms and misunderstandings.
Evaluation
This section describes the evaluation protocols and provides the results.
Recommendations
This section provides information on warnings and potential mitigations.
Glossary and Calculations
This section defines common terms and how metrics are calculated.
More Information
This section provides links to writing on dataset creation, technical specifications, lessons learned, and initial results.
Model Card Authors
Ordered roughly chronologically and by amount of time spent.
Margaret Mitchell, Giada Pistilli, Yacine Jernite, Ezinwanne Ozoani, Marissa Gerchick, Nazneen Rajani, Sasha Luccioni, Irene Solaiman, Maraim Masoud, Somaieh Nikpoor, Carlos Muñoz Ferrandis, Stas Bekman, Christopher Akiki, Danish Contractor, David Lansky, Angelina McMillan-Major, Tristan Thrush, Suzana Ilić, Gérard Dupont, Shayne Longpre, Manan Dey, Stella Biderman, Douwe Kiela, Emi Baylor, Teven Le Scao, Aaron Gokaslan, Julien Launay
