URL: https://huggingface.co/Felladrin/Minueza-2-96M-Instruct-Variant-07

Minueza-2-96M-Instruct (Variant 07)

This model is a fine-tuned version of Felladrin/Minueza-2-96M on the English Open-Orca/slimorca-deduped-cleaned-corrected dataset.

Usage

pip install transformers==4.51.1 torch==2.6.0

from transformers import pipeline, TextStreamer
import torch

# Load the model into a text-generation pipeline, using the GPU when available.
generate_text = pipeline(
    "text-generation",
    model="Felladrin/Minueza-2-96M-Instruct-Variant-07",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

# Conversation in chat format: a system prompt followed by the user's question.
messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that follows instruction extremely well. Help as much as you can.",
    },
    {
        "role": "user",
        "content": "Could you explain how the Internet works?",
    },
]

# Render the messages with the model's chat template, then generate.
# The TextStreamer prints tokens to stdout as they are produced.
generate_text(
    generate_text.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    ),
    streamer=TextStreamer(generate_text.tokenizer, skip_special_tokens=True),
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,  # 0 disables top-k filtering
    min_p=0.1,
    repetition_penalty=1.17,
)
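
The call above samples with temperature, top_p, and min_p together. As a rough illustration of the min_p setting only, the sketch below keeps the tokens whose probability is at least min_p times that of the most likely token and then renormalizes. The function name min_p_filter and the toy distribution are hypothetical, chosen for this example; transformers applies the same idea to logits tensors internally, so this is not the library's implementation.

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens with probability >= min_p * max(probs), then renormalize."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Toy next-token distribution (made-up numbers, for illustration only).
toy_probs = {"the": 0.50, "a": 0.30, "network": 0.15, "zebra": 0.04, "qux": 0.01}

# With min_p=0.1 the cutoff is 0.1 * 0.50 = 0.05, so "zebra" and "qux" are dropped.
filtered = min_p_filter(toy_probs, min_p=0.1)
print(sorted(filtered))
```

Because the cutoff scales with the top token's probability, the filter is strict when the model is confident and permissive when the distribution is flat.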

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.8e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.95) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
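
Two of the values above follow from the others, which a short sketch can make concrete. The total batch size is the per-device batch size times the gradient accumulation steps (times the device count, assumed here to be 1 since the list does not state it). The learning rate under a cosine schedule with a 0.1 warmup ratio rises linearly to its peak and then decays along a half cosine. The function lr_at and the step count of 1000 are hypothetical, used only for illustration; the actual number of training steps is not given in this card.

```python
import math

train_batch_size = 4
gradient_accumulation_steps = 32
num_devices = 1  # assumption: the device count is not stated in the list above

# Effective number of samples per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def lr_at(step, total_steps, peak_lr=5.8e-05, warmup_ratio=0.1):
    """Linear warmup to peak_lr, followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    # Hypothetical run length of 1000 optimizer steps.
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step, total_steps=1000))
```

With these assumptions the rate reaches 5.8e-05 at the end of warmup (step 100 of 1000) and falls back to zero at the final step.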

Framework versions

  • Transformers 4.51.1
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.0
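
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library; the helper names installed_version and compare_versions are hypothetical, and the package names are assumed to be the usual PyPI ones (torch for PyTorch).

```python
from importlib import metadata

# Versions listed in this model card; "torch" is the PyPI name for PyTorch.
EXPECTED = {
    "transformers": "4.51.1",
    "torch": "2.6.0",
    "datasets": "3.5.0",
    "tokenizers": "0.21.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to a tuple (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        # Local build tags such as "+cu124" are ignored in the comparison.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in compare_versions(EXPECTED).items():
        print(f"{package}: expected {wanted}, installed {found}, match={ok}")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions are simply the ones recorded at training time.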

License

This model is licensed under the Apache License 2.0.

Model size: 96M params (Safetensors, tensor type BF16)
