VOOZH about

URL: https://huggingface.co/kazuyamaa/DeepSeek-R1-Distill-Qwen-32B-axolotl-sft-v1.0

⇱ kazuyamaa/DeepSeek-R1-Distill-Qwen-32B-axolotl-sft-v1.0 · Hugging Face


👁 Built with Axolotl


DeepSeek-R1-Distill-Qwen-32B-axolotl-sft-v1.0

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B on the kanhatakeyama/ramdom-to-fixed-multiturn-Calm3, the Aratako/Magpie-Tanuki-Qwen2.5-72B-Answered, the Aratako/magpie-qwen2.5-32b-reasoning-100k-formatted, the Aratako/magpie-reasoning-llama-nemotron-70b-100k-filtered, the Aratako/Open-Platypus-Japanese-masked-formatted, the kanhatakeyama/wizardlm8x22b-logical-math-coding-sft_additional-ja, the Aratako/magpie-ultra-v0.1-formatted, the Aratako/orca-agentinstruct-1M-v1-selected and the Aratako/Synthetic-JP-EN-Coding-Dataset-801k-50k datasets. It achieves the following results on the evaluation set: - Loss: 0.6154

以下、Axolotlの実行コード

!apt-get update
!apt-get install -y libopenmpi-dev

!git clone https://github.com/axolotl-ai-cloud/axolotl

cd axolotl
!pip install -e .
!pip install packaging ninja
!pip install flash-attn
!pip install deepspeed
!pip install mpi4py

# write権限のあるtokenを利用してHFにログイン(学習後のモデルアップロードに必要)
!huggingface-cli login --token WRITE ME
# wandbにログイン(wandbに学習ログを残したい場合)
!wandb login WRITE ME

import axolotl

!python -m axolotl.cli.preprocess /workspace/deepseek-32b-ver001-simpo.yml --debug

! accelerate launch -m axolotl.cli.train /workspace/deepseek-32b-ver001-simpo.yml --deepspeed deepspeed_configs/zero3_bf16.json

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: Use paged_adamw_8bit with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 1.0

Training results

Training Loss Epoch Step Validation Loss
1.0196 0.0008 1 0.9386
0.732 0.0381 50 0.7104
0.7803 0.0763 100 0.6853
0.6013 0.1144 150 0.6712
0.6767 0.1526 200 0.6628
0.701 0.1907 250 0.6565
0.6976 0.2289 300 0.6520
0.7022 0.2670 350 0.6487
0.6889 0.3051 400 0.6449
0.6673 0.3433 450 0.6411
0.6067 0.3814 500 0.6382
0.644 0.4196 550 0.6357
0.9572 0.4577 600 0.6336
0.6466 0.4959 650 0.6310
0.6781 0.5340 700 0.6291
0.6473 0.5721 750 0.6274
0.6235 0.6103 800 0.6255
0.6564 0.6484 850 0.6238
0.6009 0.6866 900 0.6221
0.5759 0.7247 950 0.6208
0.5817 0.7628 1000 0.6197
0.6438 0.8010 1050 0.6190
0.6102 0.8391 1100 0.6180
0.5997 0.8773 1150 0.6170
0.5896 0.9154 1200 0.6164
0.5713 0.9536 1250 0.6158
0.6164 0.9917 1300 0.6154

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Downloads last month
4
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for kazuyamaa/DeepSeek-R1-Distill-Qwen-32B-axolotl-sft-v1.0

Adapter
(220)
this model

Datasets used to train kazuyamaa/DeepSeek-R1-Distill-Qwen-32B-axolotl-sft-v1.0