🐦 Llama-3.1-8B-Magpie-Align-SFT-v0.2

Project Web: https://magpie-align.github.io/

Arxiv Technical Report: https://arxiv.org/abs/2406.08464

Codes: https://github.com/magpie-align/magpie

Abstract

About This Model

This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B on

It achieves performance comparable with the official Llama-3.1-8B-Instruct Model with SFT only!

Alpaca Eval 2 (GPT-4-Turbo-1106): 20.66 (LC), 22.26 (WR)
Arena Hard: 22.2

Other Information

License: Please follow Meta Llama 3.1 Community License.

Conversation Template: Please use Llama 3 official chat template for the best performance.

Citation

If you find the model, data, or code useful, please cite our paper:

@article{xu2024magpie,
 title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing}, 
 author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
 year={2024},
 eprint={2406.08464},
 archivePrefix={arXiv},
 primaryClass={cs.CL}
}

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 32
total_train_batch_size: 128
total_eval_batch_size: 4
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 65
num_epochs: 2

Training results

Training Loss	Epoch	Step	Validation Loss
0.6921	0.0029	1	0.7830
0.4187	0.1998	69	0.4135
0.3744	0.3997	138	0.3695
0.36	0.5995	207	0.3549
0.3603	0.7993	276	0.3459
0.3517	0.9992	345	0.3407
0.3064	1.1881	414	0.3392
0.3149	1.3879	483	0.3378
0.304	1.5877	552	0.3372
0.3059	1.7876	621	0.3370
0.323	1.9874	690	0.3370