URL: https://huggingface.co/TIGER-Lab/EditReward-MiMo-VL-7B-SFT-2508

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

๐Ÿ‘ Project Website
๐Ÿ‘ arXiv
๐Ÿ‘ Paper
๐Ÿ‘ Model
๐Ÿ‘ Dataset
๐Ÿ‘ Benchmark

This repository contains the official implementation of the paper EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing.

📖 Introduction

We introduce EditReward, a human-aligned reward model powered by a high-quality dataset for instruction-guided image editing. EditReward is trained with EditReward-Data, a large-scale, high-fidelity preference dataset comprising over 200K manually annotated preference pairs. This dataset covers diverse edits produced by seven state-of-the-art models across twelve distinct sources, ensuring high alignment with human judgment.

EditReward demonstrates superior alignment with human preferences in instruction-guided image editing tasks, achieving state-of-the-art human correlation on established benchmarks like GenAI-Bench, AURORA-Bench, ImagenHub, and our new EditReward-Bench.

๐Ÿ‘ Teaser

🚀 Quick Start

To use the EditReward model for inference, follow these steps. For more details, including installation and training, please refer to the GitHub Repository.

💻 Installation

git clone https://github.com/TIGER-AI-Lab/EditReward.git
cd EditReward

conda create -n edit_reward python=3.10 -y
conda activate edit_reward
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install datasets pillow openai -U megfile sentencepiece deepspeed fire omegaconf matplotlib peft trl==0.8.6 tensorboard scipy transformers==4.56.1 accelerate
# Recommended: install flash-attn (prebuilt wheel for CUDA 12, torch 2.5, Python 3.10, Linux x86_64)
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.2.post1/flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
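After installing, it can help to confirm that the pinned packages actually resolved before loading a checkpoint. The following is a minimal sketch, not part of the EditReward repository: `check_packages` and `cuda_available` are hypothetical helper names, and the package list simply mirrors the install commands above.

```python
import importlib.metadata
import importlib.util


def check_packages(names):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            report[name] = None
    return report


def cuda_available():
    """Return torch.cuda.is_available(), or None when torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily so the script still runs without torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    # Distribution names taken from the pip commands above.
    pinned = ["torch", "torchvision", "transformers", "trl", "flash-attn"]
    for name, version in check_packages(pinned).items():
        print(f"{name}: {version or 'not installed'}")
    print(f"CUDA available: {cuda_available()}")
```

If `flash-attn` is reported as not installed, the wheel URL above probably does not match the local Python, CUDA, or torch version.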

🚀 Usage

import os
import sys
# Add project root to Python path (optional, for local development)
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import torch
from EditReward import EditRewardInferencer

# ------------------------------------------------------------------------------
# Example script for evaluating edited images with EditReward
# ------------------------------------------------------------------------------

# Path to model checkpoint (update to your own local or HF path)
CHECKPOINT_PATH = "your/local/path/to/checkpoint"
CONFIG_PATH = "config/EditReward-MiMo-VL-7B-SFT-2508.yaml"

# Initialize reward model
inferencer = EditRewardInferencer(
    config_path=CONFIG_PATH,
    checkpoint_path=CHECKPOINT_PATH,
    device="cuda",  # or "cpu"
    reward_dim="overall_detail",  # choose reward dimension if applicable
    rm_head_type="ranknet_multi_head"
)

# Example input data -----------------------------------------------------------
# image_src = [
# "../assets/examples/source_img_1.png",
# "../assets/examples/source_img_1.png",
# ]

# image_paths = [
# "../assets/examples/target_img_1.png",
# "../assets/examples/target_img_2.png",
# ]
image_src = [
    "your/local/path/to/source_image_1.jpg",
    "your/local/path/to/source_image_2.jpg",
]

image_paths = [
    "your/local/path/to/edited_image_1.jpg",
    "your/local/path/to/edited_image_2.jpg",
]

# example instruction: "Add a green bowl on the branch"
# prompts = [
# "Add a green bowl on the branch",
# "Add a green bowl on the branch"
# ]
prompts = [
    "your first editing instruction",
    "your second editing instruction"
]

# ------------------------------------------------------------------------------
# Main evaluation modes
# ------------------------------------------------------------------------------
if __name__ == "__main__":
    mode = "pairwise_inference"  # or "single_inference"

    if mode == "pairwise_inference":
        # ----------------------------------------------------------
        # Pairwise comparison: compares two edited images side-by-side
        # ----------------------------------------------------------
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=prompts,
                image_src=image_src,
                image_paths=image_paths
            )
        scores = [reward[0].item() for reward in rewards]
        print(f"[Pairwise Inference] Image scores: {scores}")

    elif mode == "single_inference":
        # ----------------------------------------------------------
        # Single image scoring: evaluates one edited image at a time
        # ----------------------------------------------------------
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=[prompts[0]],
                image_src=[image_src[0]],
                image_paths=[image_paths[0]]
            )
        print(f"[Single Inference] Image 1 score: {[reward[0].item() for reward in rewards]}")

        # Score the second edited image against the same source image and instruction
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=[prompts[0]],
                image_src=[image_src[0]],
                image_paths=[image_paths[1]]
            )
        print(f"[Single Inference] Image 2 score: {[reward[0].item() for reward in rewards]}")
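The script above passes three parallel lists to `inferencer.reward` and prints raw scores. A small sketch of two helpers that might sit around that call is shown here. It assumes that a higher score indicates a better edit, which this card does not state explicitly. `validate_batch` and `rank_edits` are hypothetical names, not EditReward APIs, and the scores in the usage lines are placeholders rather than real model outputs.

```python
def validate_batch(prompts, image_src, image_paths):
    """Raise ValueError unless the three lists are non-empty and equally long."""
    lengths = {len(prompts), len(image_src), len(image_paths)}
    if len(lengths) != 1 or 0 in lengths:
        raise ValueError(
            "prompts, image_src and image_paths must be non-empty "
            f"and the same length, got {len(prompts)}, "
            f"{len(image_src)}, {len(image_paths)}"
        )


def rank_edits(scores, image_paths):
    """Return (path, score) pairs sorted from highest to lowest score.

    Assumption: a higher reward means a better edit.
    """
    if len(scores) != len(image_paths):
        raise ValueError("scores and image_paths must have the same length")
    return sorted(zip(image_paths, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    paths = ["edited_image_1.jpg", "edited_image_2.jpg"]
    validate_batch(["instruction", "instruction"], ["src.jpg", "src.jpg"], paths)
    placeholder_scores = [0.42, 1.37]  # illustrative values only
    for path, score in rank_edits(placeholder_scores, paths):
        print(f"{path}: {score:.2f}")
```

Calling `validate_batch` before inference turns a length mismatch into an immediate, readable error instead of a failure inside the model's batching code.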

📊 Benchmark

EditReward achieves superior alignment with human preferences in instruction-guided image editing tasks. The following tables show its performance against other models on various benchmarks.
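One common way to quantify agreement between a reward model and human preferences is pairwise accuracy: the fraction of annotated pairs in which the model scores the human-preferred edit higher. The sketch below illustrates that generic metric only. It is an assumption that something like it applies here, and it is not the evaluation protocol of any benchmark named above. `pairwise_accuracy` is a hypothetical helper and the numbers in the usage lines are made up.

```python
def pairwise_accuracy(scores_a, scores_b, human_prefers_a):
    """Fraction of pairs where the higher-scored edit matches the human choice.

    scores_a, scores_b: model scores for edits A and B of each pair.
    human_prefers_a: booleans, True when annotators preferred edit A.
    Ties (equal scores) are counted as the model preferring B.
    """
    if not (len(scores_a) == len(scores_b) == len(human_prefers_a)):
        raise ValueError("all inputs must have the same length")
    if not scores_a:
        raise ValueError("at least one pair is required")
    correct = sum(
        (a > b) == prefers_a
        for a, b, prefers_a in zip(scores_a, scores_b, human_prefers_a)
    )
    return correct / len(scores_a)


if __name__ == "__main__":
    # Made-up scores and labels, for illustration only.
    acc = pairwise_accuracy(
        scores_a=[1.0, 0.2, 0.5, 0.9],
        scores_b=[0.5, 0.8, 0.7, 0.1],
        human_prefers_a=[True, False, True, True],
    )
    print(f"pairwise accuracy: {acc:.2f}")
```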



📚 Citation

Please kindly cite our paper if you use our code, data, models or results:

@article{wu2025editreward,
 title={EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing},
 author={Wu, Keming and Jiang, Sicong and Ku, Max and Nie, Ping and Liu, Minghao and Chen, Wenhu},
 journal={arXiv preprint arXiv:2509.26346},
 year={2025}
}

🙏 Acknowledgements

We would like to thank the HPSv3, VideoAlign, and GenAI-Bench codebases for providing valuable references.


💬 Support

For questions and support:

Model size: 8B params · Tensor type: BF16 · Format: Safetensors
