URL: https://huggingface.co/dphn/dolphin-2.9.4-llama3.1-8b-gguf

dphn/dolphin-2.9.4-llama3.1-8b-gguf · Hugging Face


Dolphin 2.9.4 Llama 3.1 8b 🐬

This is the GGUF conversion, for use with llama.cpp, Ollama, LM Studio, and similar tools.

Curated and trained by Eric Hartford and Cognitive Computations

๐Ÿ‘ Discord
Discord: https://discord.gg/h3K4XGj2RH

๐Ÿ‘ Image

Our appreciation for the sponsors of Dolphin 2.9.4:

This model is based on Meta Llama 3.1 8b, and is governed by the Llama 3.1 license.

The base model has 128K context, and our finetuning used 8192 sequence length.

Dolphin 2.9.4 uses the ChatML prompt template format.

Example:

<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
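
The template above can be assembled programmatically. The following is a minimal sketch, assuming a list of role/content messages; the helper name `build_chatml_prompt` and the sample user message are hypothetical and not part of any library.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render role/content messages in the ChatML layout shown above.

    `build_chatml_prompt` is a hypothetical helper written for illustration.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Write a haiku about the sea."},
])
print(prompt)
```

Runtimes that apply a chat template themselves do this step for you; building the string by hand is mainly useful with raw completion interfaces.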

Dolphin-2.9.4 has a variety of instruction following, conversational, and coding skills. It also has agentic abilities and supports function calling. It is especially trained to obey the system prompt, and follow instructions in many languages.

Dolphin is uncensored. We have filtered the dataset to remove alignment and bias. This makes the model more compliant. You are advised to implement your own alignment layer before exposing the model as a service. It will be highly compliant with any requests, even unethical ones. Please read my blog post about uncensored models: https://erichartford.com/uncensored-models

You are responsible for any content you create using this model. Enjoy responsibly.
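
One place an alignment layer can sit is between the caller and the model, checking both the request and the reply. The sketch below shows only that wiring and is an assumption, not the authors' recommended design: `guarded_generate`, `toy_policy`, and `fake_model` are hypothetical names, and the keyword check is a stand-in for a real moderation policy.

```python
REFUSAL = "Sorry, I can't help with that request."


def guarded_generate(prompt, generate, is_allowed, refusal=REFUSAL):
    """Call `generate` only if the prompt passes `is_allowed`, then check the reply.

    `generate` and `is_allowed` are caller-supplied callables (hypothetical here).
    """
    if not is_allowed(prompt):
        return refusal
    reply = generate(prompt)
    return reply if is_allowed(reply) else refusal


def toy_policy(text):
    # Placeholder policy for illustration only; a real service needs far more
    # than a keyword match.
    return "forbidden" not in text.lower()


def fake_model(prompt):
    # Stand-in for a call into the model runtime.
    return f"Echo: {prompt}"


print(guarded_generate("Tell me about dolphins.", fake_model, toy_policy))
print(guarded_generate("Something forbidden.", fake_model, toy_policy))
```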

๐Ÿ‘ Built with Axolotl


workspace/axolotl/dolphin-2.9.4-llama3.1-8b

This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B. The auto-generated card does not name the training dataset (it is recorded as None). It achieves the following results on the evaluation set:

  • Loss: 0.5655

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 256
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
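
The total batch sizes listed above follow from the per-device values, and the learning rate follows a warmup-then-cosine curve. The sketch below reproduces that arithmetic. It assumes linear warmup from zero followed by a single half-cosine decay to zero; `TOTAL_STEPS` is taken from the last step in the results table and is an assumption, not a confirmed schedule length.

```python
import math

# Values copied from the hyperparameter list.
train_batch_size = 2
eval_batch_size = 2
num_devices = 8
gradient_accumulation_steps = 16

# Sequences contributing to one optimizer step.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients.
total_eval_batch_size = eval_batch_size * num_devices

TOTAL_STEPS = 3420  # assumption: last step logged in the results table


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=5e-06, warmup_steps=100):
    """Linear warmup to `peak_lr`, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size)
for step in (0, 50, 100, 1760, 3420):
    print(step, cosine_lr(step))
```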

Training results

Training Loss   Epoch    Step   Validation Loss
0.5837          1.0180   1161   0.5814
0.5525          2.0179   2322   0.5671
0.5514          2.9624   3420   0.5655
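
The validation losses above can be read as perplexities. This sketch assumes the reported loss is the mean per-token cross-entropy in nats, which the card does not state explicitly; under that assumption, perplexity is the exponential of the loss.

```python
import math

# Validation loss by training step, copied from the results table.
eval_losses = {1161: 0.5814, 2322: 0.5671, 3420: 0.5655}

# Perplexity = exp(mean cross-entropy), assuming the loss is measured in nats.
perplexities = {step: math.exp(loss) for step, loss in eval_losses.items()}

for step, ppl in perplexities.items():
    print(f"step {step}: perplexity {ppl:.3f}")
```

Under that assumption the final checkpoint has a validation perplexity of roughly 1.76.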

Framework versions

  • Transformers 4.44.0.dev0
  • Pytorch 2.4.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

GGUF

  • Model size: 8B params
  • Architecture: llama
  • Quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
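
The bit widths listed above translate into rough weight sizes. The sketch below is a back-of-the-envelope estimate, assuming exactly 8e9 parameters and a uniform number of bits per weight. Actual GGUF files are typically somewhat larger, since quantization schemes store scales and keep some tensors at higher precision, and the runtime needs additional memory for the KV cache. `approx_weight_bytes` is a hypothetical helper.

```python
def approx_weight_bytes(n_params, bits_per_weight):
    """Lower-bound estimate: parameters times bits, converted to bytes."""
    return n_params * bits_per_weight / 8


N_PARAMS = 8e9  # "8B params" from the model card, taken as exactly 8e9

for bits in (2, 3, 4, 5, 6, 8):
    gigabytes = approx_weight_bytes(N_PARAMS, bits) / 1e9
    print(f"{bits}-bit: about {gigabytes:.0f} GB of weights")
```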
