URL: https://huggingface.co/optimum/mistral-1.1b-testing

optimum/mistral-1.1b-testing · Hugging Face


mistralized TinyLlama (TinyLlama in the Mistral architecture), since training Llama w/ flash-attn is buggy.

it's based on the 3T base model (not chat-tuned).

not extensively tested.

enjoy!
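
a minimal loading sketch, not taken from the original card. it assumes `transformers` and `torch` are installed and that the repo ships tokenizer files (the card doesn't confirm this). the helper name `complete` and its defaults are made up for illustration. since this is the base model rather than a chat-tuned one, the prompt is passed as plain text with no chat template.

```python
# Sketch only: assumes `transformers` and `torch` are installed and that the
# repo includes tokenizer files (not confirmed by the model card).
MODEL_ID = "optimum/mistral-1.1b-testing"


def complete(prompt: str, max_new_tokens: int = 32) -> str:
    """Return the prompt plus a plain-text continuation (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Base model: feed raw text, no chat template.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

calling `complete("The capital of France is")` downloads the weights on first use. the card says the model is not extensively tested, so treat the output as a smoke test rather than a quality check.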

Downloads last month: 22

Safetensors
Model size: 1B params
Tensor type: F32

Model tree for optimum/mistral-1.1b-testing
Finetunes: 5 models
Quantizations: 1 model