Edit Flows base model

Training

Trained on a single 8xH100 node:

accelerate launch \
 --config_file scripts/accelerate_configs/zero2.yaml \
 examples/editflow/llada/adapt.py \
 --model_name_or_path "GSAI-ML/LLaDA-8B-Instruct" \
 --lm_head_key "model.transformer.ff_out" \
 --init_editflow_from_src True \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 1 \
 --gradient_accumulation_steps 4 \
 --dataset_args "allenai/tulu-3-sft-mixture[train:500000]|lamm-mit/bio-silk-mech-mix-q-a-35K-messages-only|lamm-mit/graph_reasoning_v3_messages" \
 --output_dir "models/LlaDA-8B-EditFlow-instruct-v500" \
 --x0_sampler "masks[length:128]" --max_length 1500 \
 --num_train_epochs 4 \
 --learning_rate 1e-5 \
 --push_to_hub True --save_strategy "steps" --save_steps 1000 \
 --hub_model_id lamm-mit/LlaDA-8B-EditFlow-instruct-v500 \
 --hub_private_repo True --eval_strategy "no" \
 --warmup_steps 50
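
To make the training setup easier to read, the short sketch below works out the effective batch size and lists the datasets named in `--dataset_args`. It rests on two assumptions that the command itself does not confirm: that `zero2.yaml` launches one process per GPU (8 in total on the 8xH100 node), and that `|` separates dataset specs. The variable names are illustrative only.

```shell
# Values copied from the training command above.
# NUM_GPUS=8 assumes zero2.yaml starts one process per H100 (not shown in this card).
NUM_GPUS=8
PER_DEVICE_BATCH=1
GRAD_ACCUM=4

# Sequences contributing to each optimizer step under that assumption.
EFFECTIVE_BATCH=$((NUM_GPUS * PER_DEVICE_BATCH * GRAD_ACCUM))
echo "effective batch size: ${EFFECTIVE_BATCH}"

# Assumption: '|' separates individual dataset specs in --dataset_args.
DATASET_ARGS='allenai/tulu-3-sft-mixture[train:500000]|lamm-mit/bio-silk-mech-mix-q-a-35K-messages-only|lamm-mit/graph_reasoning_v3_messages'
NUM_DATASETS=$(printf '%s\n' "$DATASET_ARGS" | tr '|' '\n' | wc -l | tr -d ' ')
echo "datasets in the mixture: ${NUM_DATASETS}"
printf '%s\n' "$DATASET_ARGS" | tr '|' '\n'
```

If the accelerate config uses a different process count, only `NUM_GPUS` needs to change.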

Sampling

python examples/editflow/sample.py \
 --model_name_or_path "models/LlaDA-8B-EditFlow-instruct-v500" \
 --mask_length 128 --seed 7070 \
 --prompt "Define materiomics."
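
Because sampling depends on the seed, it can help to compare a few runs side by side. The following is a minimal sketch that repeats the sampling command over several seeds, using only the flags shown above. The extra seeds (7071, 7072), the `DRY_RUN` switch, and the `run` helper are hypothetical additions for illustration; by default the loop only prints the commands it would execute.

```shell
# Local checkpoint path, matching --output_dir in the training command.
MODEL_PATH="models/LlaDA-8B-EditFlow-instruct-v500"

# Hypothetical switch: DRY_RUN=1 prints each command, DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

COUNT=0
# 7070 is the seed from the card; the other two are arbitrary.
for SEED in 7070 7071 7072; do
  run python examples/editflow/sample.py \
    --model_name_or_path "$MODEL_PATH" \
    --mask_length 128 --seed "$SEED" \
    --prompt "Define materiomics."
  COUNT=$((COUNT + 1))
done
echo "sampling commands issued: ${COUNT}"
```

Setting `DRY_RUN=0` runs the samples for real, which requires the checkpoint to exist at `MODEL_PATH`.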

Safetensors

Model size: 9B params
Tensor type: BF16

Model tree for lamm-mit/LlaDA-8B-EditFlow-instruct-v500

Base model: GSAI-ML/LLaDA-8B-Instruct
Finetuned (35): this model
Finetunes: 3 models
