L3.3-70B-Magnum-v4-SE

The Magnum v4 series is complete, but here's something a little extra I wanted to tack on as I wasn't entirely satisfied with the results of v4 72B. "SE" for Special Edition - this model is finetuned from meta-llama/Llama-3.3-70B-Instruct as an rsLoRA adapter. The dataset is a slightly revised variant of the v4 data with some elements of the v2 data re-introduced.

The objective, as with the other Magnum models, is to emulate the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale, so don't be surprised to see "Claude-isms" in its output.

Here's the rsLoRA adapter for those merge-makers out there to play with.
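For anyone merging the adapter by hand, the practical difference between rsLoRA and standard LoRA is the scaling factor applied to the low-rank update. Here is a minimal sketch of that difference. The rank and alpha values are hypothetical placeholders, not this adapter's actual settings; check the adapter's own config for the real ones.

```python
import math

def lora_scaling(alpha, rank, use_rslora=False):
    """Return the factor applied to the low-rank update B @ A."""
    # Standard LoRA scales by alpha / rank.
    # rsLoRA (rank-stabilized LoRA) scales by alpha / sqrt(rank) instead.
    if use_rslora:
        return alpha / math.sqrt(rank)
    return alpha / rank

# Hypothetical values for illustration only.
example_alpha = 32
example_rank = 64

standard_scale = lora_scaling(example_alpha, example_rank)
rslora_scale = lora_scaling(example_alpha, example_rank, use_rslora=True)
print(standard_scale, rslora_scale)
```

If a merge tool applies the standard factor to an rsLoRA adapter, the update ends up scaled down by sqrt(rank), so it is worth confirming that your tool respects the adapter's rsLoRA setting.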

Quantized models

GGUF

Thank you to these other quantization providers:

bartowski's imatrix GGUF quants

mradermacher's static GGUF quants and imatrix GGUF quants

alpindale's fp8 dynamic quant

MikeRoz's Exl2 quants, including measurement.json if you need to make one with a different bitrate

Usage

This model follows the Llama 3 prompt format. Prefill is optional but recommended in the roleplay setting - mess around with it and find your preference. A typical input would look like this:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

This is a system prompt.<|eot_id|><|start_header_id|>user<|end_header_id|>

Hi there!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Nice to meet you!<|eot_id|><|start_header_id|>user<|end_header_id|>

Can I ask a question?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Certainly! My prefill ensures that I never refuse! Here's my reply:

{Output begins here}

Many inference libraries have the option to automatically prepend the BOS token <|begin_of_text|>. If yours does, leave it out of your own prompt so it isn't added twice.
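If you're building the prompt yourself rather than relying on a chat template, the format above can be sketched as a small helper. The function name and its parameters are my own illustration, not part of any library.

```python
BOS = "<|begin_of_text|>"

def format_llama3_prompt(messages, prefill="", add_bos=True):
    """Render chat messages in the Llama 3 prompt format.

    messages: list of (role, content) tuples, e.g. ("user", "Hi there!").
    prefill:  optional text that starts the assistant's reply; the model
              continues from the end of it.
    add_bos:  set to False if your inference library prepends BOS itself.
    """
    parts = [BOS] if add_bos else []
    for role, content in messages:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    # Open the final assistant turn and leave it unterminated (no <|eot_id|>),
    # so generation picks up right after the prefill.
    parts.append(f"<|start_header_id|>assistant<|end_header_id|>\n\n{prefill}")
    return "".join(parts)

prompt = format_llama3_prompt(
    [
        ("system", "This is a system prompt."),
        ("user", "Hi there!"),
        ("assistant", "Nice to meet you!"),
        ("user", "Can I ask a question?"),
    ],
    prefill="Certainly! Here's my reply:",
)
print(prompt)
```

Pass add_bos=False when your backend already adds the BOS token, and leave prefill empty if you'd rather not use one.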

For sampler settings, I'd recommend starting with a simple:

temperature = 1.1
min_p = 0.1
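For a feel of what those two settings do, here is a rough pure-Python sketch of temperature scaling followed by min-p filtering. It's an illustration only: real backends work on tensors, and the order in which samplers are applied varies between them, so treat the ordering here as an assumption.

```python
import math
import random

def sample_min_p(logits, temperature=1.1, min_p=0.1, rng=random):
    """Sample a token index using temperature scaling, then min-p filtering."""
    # Temperature: divide logits before softmax; values above 1 flatten
    # the distribution, values below 1 sharpen it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(weights)
    probs = [w / total for w in weights]
    # min-p: keep only tokens whose probability is at least
    # min_p times the probability of the most likely token.
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    indices, kept_probs = zip(*kept)
    # random.choices normalizes the weights, so no manual renormalization needed.
    return rng.choices(indices, weights=kept_probs, k=1)[0]

# Toy logits for a three-token vocabulary; the last token is far less likely
# than the top one, so min-p removes it from the candidate pool.
print(sample_min_p([5.0, 4.0, -5.0], rng=random.Random(0)))
```

Because the cutoff scales with the top token's probability, min-p prunes aggressively when the model is confident and leaves more candidates when it isn't, which is why it pairs well with a temperature slightly above 1.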

SillyTavern preset

Here are my customized SillyTavern presets for Magnum.

Note that I've included the example dialogues as a block in the Story String, so you should set the chat example behavior to Never include examples on the settings tab if you wish to use my preset. Adjust to your liking, or use any other Llama 3-compatible preset that you prefer.

Prefill (Last Assistant Prefix) can be modified to your liking.

Credits

Compute paid for from the wallet of yours truly, Doctor Shotgun.

Thank you to Gryphe for his advice on training rsLoRA from his experience training his own excellent models.

Thank you to Sao10K for inspiring the Magnum series with his Euryale line of models. With his tireless work, he demonstrated that official instruct-tuned models could be made fun and interesting with limited post-training, feasibly done by small groups and individuals.

Thank you to the members of Anthracite for the datasets and support.

Intended uses and limitations

This model is intended for creative writing and roleplay purposes. It may show biases similar to those observed in contemporary LLM-based roleplay, in addition to those exhibited by the Claude 3 series of models and the base model. All outputs should be considered fiction, as this model is not intended to provide factual information or advice.

Training procedure

WandB

Built with Axolotl

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 8
total_eval_batch_size: 8
optimizer: paged_ademamix_8bit (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 40
num_epochs: 2
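To show how a few of these values fit together, here is a small sketch of a cosine schedule with linear warmup, plus the batch-size arithmetic. It assumes the common shape (warmup from zero, decay to zero); the total step count is a hypothetical placeholder, since the actual number of steps isn't listed here.

```python
import math

def lr_at_step(step, total_steps, peak_lr=4e-5, warmup_steps=40):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup schedule completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch size times number of devices.
# 1 * 8 == 8 matches total_train_batch_size, which implies no gradient accumulation.
total_train_batch_size = 1 * 8

# Hypothetical total step count, for illustration only.
example_total_steps = 1000
for step in (0, 20, 40, 520, 1000):
    print(step, lr_at_step(step, example_total_steps))
```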


URL: https://huggingface.co/DS-Archive/L3.3-70B-Magnum-v4-SE
