URL: https://huggingface.co/Delta-Vector/Rei-V3-KTO-12B

Rei-12B

Another prototype Magnum... (This time with RL!)

✨ Overview

Taking the previous 12B trained with Subsequence Loss, this model is meant to smooth the base's sharp edges and improve coherency, intelligence, and prose while replicating the prose of the Claude models Opus and Sonnet.

Fine-tuned on top of Rei-V3-12B-Base, Rei-12B is designed to replicate the prose quality of Claude 3 models, particularly Sonnet and Opus, using a prototype Magnum V5 datamix.

📥 Quantized Models

💬 Prompt Format

Rei-12B uses the ChatML format. A typical conversation should be structured as:

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
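
As a minimal sketch of how the conversation above could be assembled in code: the helper below joins role/content pairs into a ChatML string and leaves an open assistant turn for the model to complete. The function name `format_chatml` and its parameters are hypothetical, not part of any library; in practice the tokenizer's own chat template would normally do this job.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; not an official API.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model writes the next reply
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]
prompt = format_chatml(conversation)
```

A system prompt, if used, would go first as a message with the role `system`.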

Recommended System Prompt

⚙️ Training

Hparams

For this model's hparams we used a grad clip of 1e-4, as it has proven to be the best value for Mistral-12B based models, and it also keeps Rewards/Chosen from flat-lining, as Hermes-genned data is... The biggest piece of dogshit.
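
To make the effect of such a small clip value concrete, here is a rough sketch of clipping by global gradient norm in plain Python. It assumes the grad clip is a max-norm clip (the card does not say which kind), and the function name `clip_by_global_norm` and the `eps` constant are illustrative choices, not taken from the training config.

```python
import math


def clip_by_global_norm(grads, max_norm=1e-4, eps=1e-6):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Assumption: the card's "grad clip of 1e-4" is a global-norm clip.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale down only when the norm exceeds the limit; never scale up
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0])
norm_after = math.sqrt(sum(g * g for g in clipped))
```

With a limit this tight, almost every step gets rescaled, so the update direction is preserved while its magnitude is capped near 1e-4.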

Configuration

The model was trained for 1 epoch on 8x NVIDIA H100 GPUs generously provided by @Kalomaze.

👁 Built with Axolotl

⚠️ Credits

I'd like to thank Ruka/Sama twinkman | LucyKnada | Kubernetes Bad | PocketDoc | Tav | Trappu | Alicat | and the rest of Anthracite/Pygmalion for testing, feedback, and support.

Rei-12B | KTO

Downloads last month: 65

Safetensors

Model size: 12B params

Tensor type: BF16

Model tree for Delta-Vector/Rei-V3-KTO-12B

Base model: mistralai/Mistral-Nemo-Base-2407

Finetuned: NewEden/MistralAI-Nemo-Instruct-ChatML

Finetuned: Delta-Vector/Rei-12B-V3-Base

Finetuned (2): this model
