URL: https://huggingface.co/bigscience/bloom-optimizer-states

bigscience/bloom-optimizer-states · Hugging Face


WARNING: The checkpoints in this repo are Megatron-DeepSpeed checkpoints. Use them together with our fork of Megatron-DeepSpeed. For a normal Hugging Face Transformers checkpoint please go here instead.

BLOOM LM
BigScience Large Open-science Open-access Multilingual Language Model
Model Card

Version 1.0 / 20.July.2022 - Model card copied from bloom-176-intermediate repo

Available intermediary checkpoints (global steps):

  • 95000

To request additional checkpoints, open an issue on this repo.

Table of Contents

  1. Model Details
  2. Uses
  3. Training Data
  4. Risks and Limitations
  5. Evaluation
  6. Recommendations
  7. Glossary and Calculations
  8. More Information
  9. Model Card Authors

Model Details

BLOOM is a type of language model, which is a probability distribution over sequences of words. Specifically, BLOOM is a Large Language Model (LLM), meaning that it is trained on vast amounts of text data using industrial-scale computational resources. As such, the model is able to capture the statistical tendencies of words, phrases, sentences, and larger spans of text that it is exposed to in the training data.
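
To make "a probability distribution over sequences of words" concrete, here is a minimal sketch in Python. It is not how BLOOM works internally: BLOOM is a Transformer operating on subword tokens, whereas this toy counts word pairs (a bigram model) in a three-sentence corpus. The corpus and the function names `train_bigram` and `sequence_log_prob` are hypothetical and exist only for illustration. The sketch shows the chain-rule idea that a sequence's probability is the product of each word's probability given what came before.

```python
import math
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Estimate P(next word | previous word) from raw counts (toy, hypothetical)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    # Normalise each row of counts into a conditional distribution.
    return {
        prev: {word: c / sum(nxt.values()) for word, c in nxt.items()}
        for prev, nxt in counts.items()
    }


def sequence_log_prob(model, sentence):
    """Chain rule: sum the log-probabilities of each word given the previous one."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    logp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = model.get(prev, {}).get(cur, 0.0)
        if p == 0.0:
            return float("-inf")  # word pair never seen in the toy corpus
        logp += math.log(p)
    return logp


# Assumed toy data, not taken from BLOOM's training corpus.
toy_corpus = ["the cat sat", "the dog sat", "the cat ran"]
model = train_bigram(toy_corpus)

for s in ["the cat sat", "the dog ran"]:
    print(s, "->", math.exp(sequence_log_prob(model, s)))
```

In this toy, sentences seen in the corpus receive non-zero probability while an unseen combination such as "the dog ran" receives zero. A large model trained on vast amounts of text learns far smoother estimates, which is what lets it capture the statistical tendencies of longer spans.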

Basics

This section provides information about the model type, version, license, funders, release date, developers, and contact information. It is useful for anyone who wants to reference the model.

Technical Specifications

This section includes details about the model objective and architecture, and the compute infrastructure. It is useful for people interested in model development.


Training

This section provides information about the training data, the speed and size of training elements, and the environmental impact of training. It is useful for people who want to learn more about the model inputs and training footprint.


Uses

This section addresses questions around how the model is intended to be used, discusses the foreseeable users of the model (including those affected by the model), and describes uses that are considered out of scope or misuse of the model. It is useful for anyone considering using the model or who is affected by the model.


Risks and Limitations

This section identifies foreseeable harms and misunderstandings.


Evaluation

This section describes the evaluation protocols and provides the results.


Recommendations

This section provides information on warnings and potential mitigations.


Glossary and Calculations

This section defines common terms and how metrics are calculated.


More Information

This section provides links to writing on dataset creation, technical specifications, lessons learned, and initial results.


Model Card Authors

Ordered roughly chronologically and by amount of time spent.

Margaret Mitchell, Giada Pistilli, Yacine Jernite, Ezinwanne Ozoani, Marissa Gerchick, Nazneen Rajani, Sasha Luccioni, Irene Solaiman, Maraim Masoud, Somaieh Nikpoor, Carlos Muñoz Ferrandis, Stas Bekman, Christopher Akiki, Danish Contractor, David Lansky, Angelina McMillan-Major, Tristan Thrush, Suzana Ilić, Gérard Dupont, Shayne Longpre, Manan Dey, Stella Biderman, Douwe Kiela, Emi Baylor, Teven Le Scao, Aaron Gokaslan, Julien Launay
