URL: https://huggingface.co/lamm-mit/LlaDA-8B-EditFlow-instruct-v500

lamm-mit/LlaDA-8B-EditFlow-instruct-v500


Edit Flows base model

Training

Trained on a single 8xH100 node:

accelerate launch \
 --config_file scripts/accelerate_configs/zero2.yaml \
 examples/editflow/llada/adapt.py \
 --model_name_or_path "GSAI-ML/LLaDA-8B-Instruct" \
 --lm_head_key "model.transformer.ff_out" \
 --init_editflow_from_src True \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 1 \
 --gradient_accumulation_steps 4 \
 --dataset_args "allenai/tulu-3-sft-mixture[train:500000]|lamm-mit/bio-silk-mech-mix-q-a-35K-messages-only|lamm-mit/graph_reasoning_v3_messages" \
 --output_dir "models/LlaDA-8B-EditFlow-instruct-v500" \
 --x0_sampler "masks[length:128]" --max_length 1500 \
 --num_train_epochs 4 \
 --learning_rate 1e-5 \
 --push_to_hub True --save_strategy "steps" --save_steps 1000 \
 --hub_model_id lamm-mit/LlaDA-8B-EditFlow-instruct-v500 \
 --hub_private_repo True --eval_strategy "no" \
 --warmup_steps 50 
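
The batch-related flags above combine into an effective batch size per optimizer step. A minimal sketch of that arithmetic follows, assuming all 8 GPUs of the node take part in training (the actual process count is set in the accelerate config file, which is not shown here). The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Values copied from the training command; NUM_GPUS is an assumption
# based on "a single 8xH100 node" (the accelerate config may differ).
NUM_GPUS=8
PER_DEVICE_BATCH=1   # --per_device_train_batch_size
GRAD_ACCUM=4         # --gradient_accumulation_steps

# Sequences contributing to each optimizer update.
EFFECTIVE_BATCH=$((NUM_GPUS * PER_DEVICE_BATCH * GRAD_ACCUM))
echo "effective batch size: ${EFFECTIVE_BATCH}"
```

Under these assumptions each optimizer step sees 32 sequences; changing any of the three values scales the result proportionally.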

Sampling

python examples/editflow/sample.py \
 --model_name_or_path "models/LlaDA-8B-EditFlow-instruct-v500" \
 --mask_length 128 --seed 7070 \
 --prompt "Define materiomics."
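
To compare several generations for the same prompt, the sampling command can be repeated with different seeds. The sketch below is a dry run that only prints the commands. It uses the flags from the command above and assumes that the local model path matches the training `--output_dir` and that `--seed` controls sampling randomness. `MODEL_PATH`, `SEEDS`, and `build_cmd` are illustrative names, not part of the repository.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one sample.py command per seed instead of executing it.
MODEL_PATH="models/LlaDA-8B-EditFlow-instruct-v500"  # assumed local path
PROMPT="Define materiomics."
SEEDS="7070 7071 7072"                               # illustrative seed list

build_cmd() {
  # $1 = seed; only flags that appear in the sampling command are used.
  printf 'python examples/editflow/sample.py --model_name_or_path "%s" --mask_length 128 --seed %s --prompt "%s"\n' \
    "$MODEL_PATH" "$1" "$PROMPT"
}

for seed in $SEEDS; do
  build_cmd "$seed"
done
```

Piping the printed lines to `bash` would run them one after another.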

Model size: 9B params (Safetensors)
Tensor type: BF16