WARNING: The checkpoints in this repo are not fully trained models. Evaluations of the intermediary checkpoints and of the final model will be added once they have been conducted (see below).
BLOOM LM
BigScience Large Open-science Open-access Multilingual Language Model
Model Card
Version 1.3 / 11 July 2022
Available intermediary checkpoints (global steps):
1000, 10000, 100000, 200000, 300000, 400000, 500000, 600000
You can check the available checkpoints in the branches section of the repo.
How to load a specific version
We use git tags to identify each version of the model (e.g. global_step1000). Pass the tag as the revision argument to load that version:
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-760m-intermediate",
    revision="global_step1000",
    torch_dtype="auto",
)
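When switching between several checkpoints, it can help to build and validate the revision name in one place. The following is a minimal sketch, not part of the official card: the helper names `revision_for_step` and `load_checkpoint` are hypothetical, and the `global_step<N>` naming pattern is an assumption generalized from the single `global_step1000` example above.

```python
# Global steps listed in this model card as available intermediary checkpoints.
AVAILABLE_STEPS = (1000, 10000, 100000, 200000, 300000, 400000, 500000, 600000)


def revision_for_step(step: int) -> str:
    """Return the revision name for a listed global step.

    Assumes revisions follow the "global_step<N>" pattern seen in the
    card's own example; this pattern is not confirmed for every step.
    """
    if step not in AVAILABLE_STEPS:
        raise ValueError(
            f"step {step} is not a listed checkpoint; choose one of {AVAILABLE_STEPS}"
        )
    return f"global_step{step}"


def load_checkpoint(step: int, repo_id: str = "bigscience/bloom-760m-intermediate"):
    """Load the checkpoint saved at the given global step (hypothetical helper)."""
    # Imported lazily so the revision helper can be used without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo_id,
        revision=revision_for_step(step),
        torch_dtype="auto",
    )
```

For example, `load_checkpoint(100000)` would request the revision `global_step100000`, while an unlisted step fails early with a `ValueError` instead of a failed download.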
Table of Contents
- Model Details
- Uses
- Training Data
- Risks and Limitations
- Evaluation
- Recommendations
- Glossary and Calculations
- More Information
- Model Card Authors
Model Details
BLOOM is a type of language model, which is a probability distribution over sequences of words. Specifically, BLOOM is a Large Language Model (LLM), meaning that it is trained on vast amounts of text data using industrial-scale computational resources. As such, the model is able to capture the statistical tendencies of words, phrases, sentences, and larger spans of text that it is exposed to in the training data.
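To make "a probability distribution over sequences of words" concrete, here is a toy illustration. It is a simple bigram counter over a three-sentence corpus invented for this example, not BLOOM's architecture or training procedure. It scores a sentence by multiplying the probability of each word given the word before it.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]  # sentence boundary markers
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


def sequence_probability(counts, sentence):
    """Probability of a sentence as a product of next-word probabilities."""
    tokens = ["<s>"] + sentence + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0:
            return 0.0  # the previous word was never seen in training
        prob *= counts[prev][cur] / total
    return prob


# Invented toy corpus, used only for illustration.
toy_corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
model = train_bigram(toy_corpus)

p_seen = sequence_probability(model, ["the", "cat", "sat"])    # word order seen in training
p_unseen = sequence_probability(model, ["sat", "the", "cat"])  # word order never seen
```

The sentence that matches the training data receives a higher probability than the scrambled one, which is the sense in which a language model captures statistical tendencies of its training text. A large model such as BLOOM conditions on far longer contexts than the single previous word used here.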
Basics
This section provides information about the model type, version, license, funders, release date, developers, and contact information. It is useful for anyone who wants to reference the model.
Technical Specifications
This section includes details about the model objective and architecture, and the compute infrastructure. It is useful for people interested in model development.
Training
This section provides information about the training data, the speed and size of training elements, and the environmental impact of training. It is useful for people who want to learn more about the model inputs and training footprint.
Uses
This section addresses questions around how the model is intended to be used, discusses the foreseeable users of the model (including those affected by the model), and describes uses that are considered out of scope or misuse of the model. It is useful for anyone considering using the model or who is affected by the model.
Risks and Limitations
This section identifies foreseeable harms and misunderstandings.
Evaluation
This section describes the evaluation protocols and provides the results.
Recommendations
This section provides information on warnings and potential mitigations.
Glossary and Calculations
This section defines common terms and how metrics are calculated.
More Information
This section provides links to writing on dataset creation, technical specifications, lessons learned, and initial results.
Model Card Authors
Ordered roughly chronologically and by amount of time spent.
Margaret Mitchell, Giada Pistilli, Yacine Jernite, Ezinwanne Ozoani, Marissa Gerchick, Nazneen Rajani, Sasha Luccioni, Irene Solaiman, Maraim Masoud, Somaieh Nikpoor, Carlos Muñoz Ferrandis, Stas Bekman, Christopher Akiki, Danish Contractor, David Lansky, Angelina McMillan-Major, Tristan Thrush, Suzana Ilić, Gérard Dupont, Shayne Longpre, Manan Dey, Stella Biderman, Douwe Kiela, Emi Baylor, Teven Le Scao, Aaron Gokaslan, Julien Launay, Niklas Muennighoff
