
URL: https://huggingface.co/Delta-Vector/Rei-V3-KTO-12B

Delta-Vector/Rei-V3-KTO-12B · Hugging Face


Rei-12B

Another prototype Magnum... (This time with RL!)


✨ Overview

Building on the previous 12B trained with Subsequence Loss, this model is meant to smooth the base's sharp edges and improve coherency, intelligence, and prose while replicating the prose of the Claude models Opus and Sonnet.

Fine-tuned on top of Rei-V3-12B-Base, Rei-12B is designed to replicate the prose quality of Claude 3 models, particularly Sonnet and Opus, using a prototype Magnum V5 datamix.

📥 Quantized Models

💬 Prompt Format

Rei-12B uses the ChatML format. A typical conversation should be structured as:

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
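
For anyone assembling prompts by hand, here is a minimal Python sketch of the layout above. The helper name `build_chatml_prompt` and the `role`/`content` message dictionaries are assumptions made for illustration, not something shipped with the model; in practice the tokenizer's own chat template, if the repository provides one, is the safer choice.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render a list of {'role', 'content'} messages as a ChatML string.

    Hypothetical helper: it mirrors the layout shown in this card and nothing more.
    """
    parts = []
    for message in messages:
        # Each turn: start token, role, newline, content, end token, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

A system prompt, if used, would go in as a first message with the role `system`, following the usual ChatML convention.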

Recommended System Prompt

⚙️ Training

Hparams

  • For this model's hparams, we used a grad clip of 1e-4, as it has proven to be the best value for Mistral-12B based models. It also keeps Rewards/Chosen from flat-lining, since Hermes-genned data is... the biggest piece of dogshit.
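
To make the grad clip value concrete, here is a small pure-Python sketch of gradient clipping. It assumes "grad clip" means clipping by global L2 norm (the behaviour of settings such as `max_grad_norm` in common trainers); the card does not say which variant was used, and the function name `clip_by_global_norm` is made up for this example.

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Rescale gradients so their global L2 norm is at most max_norm.

    Assumption: norm-based clipping; the card only states the value 1e-4.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit: leave the gradients untouched.
        return list(grads)
    scale = max_norm / total_norm
    # Every component shrinks by the same factor, so direction is preserved.
    return [g * scale for g in grads]


grads = [3.0, 4.0]  # toy gradient with L2 norm 5.0
clipped = clip_by_global_norm(grads, max_norm=1e-4)
clipped_norm = math.sqrt(sum(g * g for g in clipped))
print(clipped, clipped_norm)
```

With a threshold this small, almost every step is rescaled, so the gradient mostly contributes a direction and the effective step size is set by the learning rate and optimizer.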

Configuration

The model was trained for 1 epoch on 8x NVIDIA H100 GPUs generously provided by @Kalomaze.

⚠️ Credits

I'd like to thank Ruka/Sama twinkman | LucyKnada | Kubernetes Bad | PocketDoc | Tav | Trappu | Alicat | and the rest of Anthracite/Pygmalion for testing, feedback, and support.

Rei-12B | KTO

Downloads last month: 65
Model size: 12B params (Safetensors)
Tensor type: BF16
