Qwen2.5-Lumen-14B
- Qwen direct preference optimization finetuned for ~3 epochs.
A qwen2.5 preference finetune, targeting prompt adherence, storywriting and roleplay.
- Llama.cpp
(GGUF) Thanks QuantFactory
- static - GGUF
(GGUF) Thanks mradermacher
(GGUF) Thanks Triangle104
Other quant repositories also exist on huggingface and can be searched for.
Training Notes
Trained Qwen2.5-14B-Instruct for 2 epochs on NVidia A100, and on dataset jondurbin/gutenberg-dpo-v0.1, saving different checkpoints along the way (completely different runs at varying epochs and learning rates).
Tanliboy trained Qwen2.5-14B-Instruct for 1 epoch on HuggingFaceH4/ultrafeedback_binarized, (Credit to Tanliboy! Check out the model here)
Mass checkpoint merged, Based on Qwen2.5-14B-Instruct (Base Model).
Merge
Merged with a sophosympatheia's SLERP gradient "Ultrafeedback-Binarized DPO" and "Gutenberg DPO"
Merged with a sophosympatheia's SLERP gradient "Qwen2.5-14B-Instruct" and "Gutenberg DPO"
Merged all DPO checkpoints and SLERP variations with MODEL_STOCK to analyze geometric properties and get the most performant aspects of all runs/merges. Model Stock was chosen due to the similarity between the merged models.
This was chosen due to the fact that evaluation for ORPO is unclear, so it's hard to know which runs are the best.
One-Attempt generated example:
- Temp 1.3 [1], Min_P 0.012 [4], TFS 0.97 [3], Smooth_Factor 0.3 [2], Smoothing_Curve 1.1, Rep 1.1, Rep Range 1000
- Temp 1.3, Min_P 0.012, Rep 1.1
As you can see the model has mostly adapted to the intended response style from Gutenberg dataset.
Recipe
models:
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- model: v000000/Qwen2.5-14B-Gutenberg-0.6e-Sequential
- model: v000000/Qwen2.5-14B-Gutenberg-0.25e-Early
- model: v000000/Qwen2.5-14B-Gutenberg-2e-Sequential
- model: v000000/Qwen2.5-14B-Gutenberg-0.37e-Early
- model: v000000/Qwen2.5-14B-Gutenberg-2e-Zeta
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Theta
- model: tanliboy/lambda-qwen2.5-14b-dpo-test
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- model: tanliboy/lambda-qwen2.5-14b-dpo-test
- model: v000000/Qwen2.5-14B-Gutenberg-UltraLambda-Slerpeno
- model: v000000/Qwen2.5-14B-Gutenberg-Instruct-Slerpeno
base_model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
merge_method: model_stock
dtype: bfloat16
If your use case is character-based roleplay, please consider using the prompts below for an enhanced experience
- For realistic RP/non-RPGs - MarinaraSpaghetti ChatML Customized
- For freeform RP/RPGs - MarinaraSpaghetti ChatML Basic
Finetune and merge
This is a merge and finetune of pre-trained language models.
Models Merged
The following models were included in the merge:
- v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- v000000/Qwen2.5-14B-Gutenberg-0.6e-Sequential
- v000000/Qwen2.5-14B-Gutenberg-0.25e-Early
- v000000/Qwen2.5-14B-Gutenberg-2e-Sequential
- v000000/Qwen2.5-14B-Gutenberg-0.37e-Early
- v000000/Qwen2.5-14B-Gutenberg-2e-Zeta
- v000000/Qwen2.5-14B-Gutenberg-1e-Theta
- v000000/Qwen2.5-14B-Gutenberg-UltraLambda-Slerpeno
- v000000/Qwen2.5-14B-Gutenberg-Instruct-Slerpeno
- tanliboy/lambda-qwen2.5-14b-dpo-test
- Context Length: Full 131,072 tokens and generation 8192 tokens
- Qwen2(ChatML) Prompt format
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 32.20 |
| IFEval (0-Shot) | 80.64 |
| BBH (3-Shot) | 48.51 |
| MATH Lvl 5 (4-Shot) | 0.00 |
| GPQA (0-shot) | 10.40 |
| MuSR (0-shot) | 10.29 |
| MMLU-PRO (5-shot) | 43.36 |
- Downloads last month
- 24
Model tree for v000000/Qwen2.5-Lumen-14B
Base model
Qwen/Qwen2.5-14BDatasets used to train v000000/Qwen2.5-Lumen-14B
Collections including v000000/Qwen2.5-Lumen-14B
Paper for v000000/Qwen2.5-Lumen-14B
Evaluation results
- strict accuracy on IFEval (0-Shot)Open LLM Leaderboard80.640
- normalized accuracy on BBH (3-Shot)Open LLM Leaderboard48.510
- exact match on MATH Lvl 5 (4-Shot)Open LLM Leaderboard0.000
- acc_norm on GPQA (0-shot)Open LLM Leaderboard10.400
- acc_norm on MuSR (0-shot)Open LLM Leaderboard10.290
- accuracy on MMLU-PRO (5-shot)test set Open LLM Leaderboard43.360
