This repo contains GGUF quants of the model. If you need the original weights, please find them here.
This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus.
This model is experimental because it was trained on top of an instruct model, but it turned out amazing. Hence the code name magnum-alter: it is the original model that kickstarted the v4 family.
This model is fine-tuned on top of Qwen2.5-72B-Instruct.
Prompting
A typical input would look like this:
<|im_start|>system
system prompt<|im_end|>
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
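The layout above can be assembled programmatically. The following is a minimal sketch, assuming the messages are held as a list of role/content dictionaries; the helper name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical and not part of any library shipped with this model.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the format shown above.

    Hypothetical helper for illustration only.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

Many inference frontends apply this template automatically from the model's metadata, so building the string by hand is typically only needed when sending raw completion requests.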
SillyTavern templates
Below are Instruct and Context templates for use within SillyTavern.
Axolotl config
Credits
We'd like to thank DoctorShotgun for sponsoring the compute for this training run. We would also like to thank all members of Anthracite who made this fine-tune possible.
Datasets
- anthracite-org/c2_logs_32k_llama3_qwen2_v1.2
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- lodrick-the-lafted/kalo-opus-instruct-3k-filtered
- anthracite-org/nopm_claude_writing_fixed
- anthracite-org/kalo_opus_misc_240827
- anthracite-org/kalo_misc_part2
Training
We used 8x MI300X GPUs, graciously provided by DoctorShotgun, for the full-parameter fine-tuning of the model.
Safety
...
GGUF
Model size: 73B params
Architecture: qwen2
