
URL: https://huggingface.co/DS-Archive/L3.3-70B-Magnum-v4-SE



L3.3-70B-Magnum-v4-SE

The Magnum v4 series is complete, but here's something a little extra I wanted to tack on as I wasn't entirely satisfied with the results of v4 72B. "SE" for Special Edition - this model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. The dataset is a slightly revised variant of the v4 data with some elements of the v2 data re-introduced.

The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output.

Here's the rsLoRA adapter for those merge-makers out there to play with.
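For anyone unfamiliar with the term, rank-stabilized LoRA (rsLoRA) differs from standard LoRA in how the low-rank update is scaled: by alpha / sqrt(r) rather than alpha / r. Here is a minimal sketch of that difference. The `lora_scale` helper and the alpha and rank values are hypothetical, picked only for illustration; this card does not state the values used for training.

```python
import math


def lora_scale(alpha: float, rank: int, rank_stabilized: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    if rank_stabilized:
        # rsLoRA: divide by sqrt(rank) so the update does not shrink as rank grows
        return alpha / math.sqrt(rank)
    # Standard LoRA: divide by rank
    return alpha / rank


# Hypothetical values, NOT taken from this model's training config
alpha, rank = 32, 64
print(lora_scale(alpha, rank, rank_stabilized=False))  # 0.5
print(lora_scale(alpha, rank, rank_stabilized=True))   # 4.0
```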

Quantized models

GGUF

Thank you to these other quantization providers:

bartowski's imatrix GGUF quants

mradermacher's static GGUF quants and imatrix GGUF quants

alpindale's fp8 dynamic quant

MikeRoz's Exl2 quants, including measurement.json if you need to make one with a different bitrate.

Usage

This model follows the Llama 3 prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like this:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

This is a system prompt.<|eot_id|><|start_header_id|>user<|end_header_id|>

Hi there!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Nice to meet you!<|eot_id|><|start_header_id|>user<|end_header_id|>

Can I ask a question?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Certainly! My prefill ensures that I never refuse! Here's my reply:

{Output begins here}

Many inference libraries can automatically prepend the BOS token <|begin_of_text|>. If yours does, leave it out of your prompt string so it isn't added twice.
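If you are assembling the prompt by hand, the format above can be sketched in a few lines of Python. The function name `format_llama3` and its parameters are made up for this example and are not part of any library. Set `add_bos=False` if your backend already prepends the BOS token.

```python
BOS = "<|begin_of_text|>"


def format_llama3(messages, prefill="", add_bos=True):
    """Build a Llama 3 prompt string.

    messages: list of (role, content) tuples, e.g. ("user", "Hi there!")
    prefill:  optional text that starts the final assistant turn
    add_bos:  set to False if the inference library adds BOS itself
    """
    parts = [BOS] if add_bos else []
    for role, content in messages:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    # Leave the last assistant turn open (no <|eot_id|>) so the model continues it
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n" + prefill)
    return "".join(parts)


prompt = format_llama3(
    [
        ("system", "This is a system prompt."),
        ("user", "Hi there!"),
        ("assistant", "Nice to meet you!"),
        ("user", "Can I ask a question?"),
    ],
    prefill="Certainly! Here's my reply:",
)
print(prompt)
```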

For sampler settings, I'd recommend starting with a simple:

temperature = 1.1
min_p = 0.1
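To give a feel for what these two settings do, here is a rough pure-Python sketch of temperature scaling followed by min-p filtering: tokens whose probability falls below `min_p` times the top token's probability are dropped before sampling. The function `sample_min_p` is hypothetical, and the order in which samplers are applied varies between backends; applying temperature first is an assumption of this sketch, not a statement about any particular library.

```python
import math
import random


def sample_min_p(logits, temperature=1.1, min_p=0.1, rng=random):
    """Sample a token index using temperature scaling, then min-p filtering."""
    # Temperature: values above 1 flatten the distribution
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p: keep tokens with probability >= min_p * (probability of the top token)
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    ids, weights = zip(*kept)
    # random.choices renormalizes the remaining weights for us
    return rng.choices(ids, weights=weights, k=1)[0]


# Toy logits: the third token is far too unlikely to survive the min-p cutoff
print(sample_min_p([2.0, 2.0, -10.0]))
```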

SillyTavern preset

Here are my customized SillyTavern presets for Magnum.

Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to "Never include examples" on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer.

Prefill (Last Assistant Prefix) can be modified to your liking.





Credits

Compute paid for from the wallet of yours truly, Doctor Shotgun.

Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models.

Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals.

Thank you to the members of Anthracite for the datasets and support.

Intended uses and limitations

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

Training procedure

WandB

Built with Axolotl


Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: paged_ademamix_8bit (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 40
  • num_epochs: 2
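As a sanity check on how these numbers relate, the sketch below reproduces the total batch size from the per-device batch size and device count, and shows one common form of a cosine schedule with linear warmup. The helper names and `total_steps` are hypothetical (the total step count is not listed above), and the exact schedule the training framework applied may differ in detail.

```python
import math

# Values from the hyperparameter list above
LEARNING_RATE = 4e-05
WARMUP_STEPS = 40
TRAIN_BATCH_SIZE = 1
NUM_DEVICES = 8


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Total batch size = per-device batch * devices * accumulation steps."""
    return per_device * num_devices * grad_accum_steps


def lr_at(step, total_steps, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# 1 * 8 = 8 matches total_train_batch_size, which implies no gradient accumulation
print(effective_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES))  # 8

total_steps = 1000  # hypothetical; not stated in this card
for step in (0, 20, 40, 520, 1000):
    print(step, lr_at(step, total_steps))
```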

Model size: 71B params (Safetensors)
Tensor type: BF16
