EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

This repository contains the official implementation of the paper EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing.

📖 Introduction

We introduce EditReward, a human-aligned reward model powered by a high-quality dataset for instruction-guided image editing. EditReward is trained with EditReward-Data, a large-scale, high-fidelity preference dataset comprising over 200K manually annotated preference pairs. This dataset covers diverse edits produced by seven state-of-the-art models across twelve distinct sources, ensuring high alignment with human judgment.

EditReward demonstrates superior alignment with human preferences in instruction-guided image editing tasks, achieving state-of-the-art human correlation on established benchmarks like GenAI-Bench, AURORA-Bench, ImagenHub, and our new EditReward-Bench.

🚀 Quick Start

To use the EditReward model for inference, follow these steps. For more details, including installation and training, please refer to the GitHub Repository.

💻 Installation

git clone https://github.com/TIGER-AI-Lab/EditReward.git
cd EditReward

conda create -n edit_reward python=3.10 -y
conda activate edit_reward
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install datasets pillow openai -U megfile sentencepiece deepspeed fire omegaconf matplotlib peft trl==0.8.6 tensorboard scipy transformers==4.56.1 accelerate
# Recommended: install flash-attn
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.2.post1/flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
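
After installation, it can help to confirm that the pinned versions actually landed in the environment. The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of EditReward, and the pins are copied from the pip commands above.

```python
from importlib import metadata

# Version pins copied from the pip commands in the installation steps.
PINNED = {
    "torch": "2.5.1",
    "torchvision": "0.20.1",
    "transformers": "4.56.1",
    "trl": "0.8.6",
}


def check_pins(pins):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Wheels from the PyTorch index carry a local tag such as "+cu124",
        # so compare only the part before the "+".
        base = installed.split("+")[0]
        report[name] = (installed, base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins(PINNED).items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{name}: installed={installed} pinned={PINNED[name]} -> {status}")
```

A mismatch is not necessarily fatal, but it is a reasonable first thing to check if model loading fails.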

🚀 Usage

import os
import sys
# Add project root to Python path (optional, for local development)
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import torch
from EditReward import EditRewardInferencer

# ------------------------------------------------------------------------------
# Example script for evaluating edited images with EditReward
# ------------------------------------------------------------------------------

# Path to model checkpoint (update to your own local or HF path)
CHECKPOINT_PATH = "your/local/path/to/checkpoint"
CONFIG_PATH = "config/EditReward-MiMo-VL-7B-SFT-2508.yaml"

# Initialize reward model
inferencer = EditRewardInferencer(
    config_path=CONFIG_PATH,
    checkpoint_path=CHECKPOINT_PATH,
    device="cuda",                # or "cpu"
    reward_dim="overall_detail",  # choose reward dimension if applicable
    rm_head_type="ranknet_multi_head"
)

# Example input data -----------------------------------------------------------
# image_src = [
# "../assets/examples/source_img_1.png",
# "../assets/examples/source_img_1.png",
# ]

# image_paths = [
# "../assets/examples/target_img_1.png",
# "../assets/examples/target_img_2.png",
# ]
image_src = [
    "your/local/path/to/source_image_1.jpg",
    "your/local/path/to/source_image_2.jpg",
]

image_paths = [
    "your/local/path/to/edited_image_1.jpg",
    "your/local/path/to/edited_image_2.jpg",
]

# example instruction: "Add a green bowl on the branch"
# prompts = [
# "Add a green bowl on the branch",
# "Add a green bowl on the branch"
# ]
prompts = [
    "your first editing instruction",
    "your second editing instruction"
]

# ------------------------------------------------------------------------------
# Main evaluation modes
# ------------------------------------------------------------------------------
if __name__ == "__main__":
    mode = "pairwise_inference"  # or "single_inference"

    if mode == "pairwise_inference":
        # ----------------------------------------------------------
        # Pairwise comparison: compares two edited images side-by-side
        # ----------------------------------------------------------
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=prompts,
                image_src=image_src,
                image_paths=image_paths
            )
        scores = [reward[0].item() for reward in rewards]
        print(f"[Pairwise Inference] Image scores: {scores}")

    elif mode == "single_inference":
        # ----------------------------------------------------------
        # Single image scoring: evaluates one edited image at a time
        # ----------------------------------------------------------
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=[prompts[0]],
                image_src=[image_src[0]],
                image_paths=[image_paths[0]]
            )
        print(f"[Single Inference] Image 1 score: {[reward[0].item() for reward in rewards]}")

        # Second edited image, scored against the same (first) source image
        # and the same (first) instruction as above.
        with torch.no_grad():
            rewards = inferencer.reward(
                prompts=[prompts[0]],
                image_src=[image_src[0]],
                image_paths=[image_paths[1]]
            )
        print(f"[Single Inference] Image 2 score: {[reward[0].item() for reward in rewards]}")
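
Once the scores are printed, one possible way to read them is sketched below. It assumes that a higher score means a better edit, and it borrows the RankNet convention (suggested by `rm_head_type="ranknet_multi_head"`) that the difference between two scores maps to a preference probability through a sigmoid. Whether EditReward's outputs are calibrated for this reading is an assumption, and `preference_probability` and `pick_preferred` are hypothetical helpers, not part of the EditReward API.

```python
import math


def preference_probability(score_a, score_b):
    """RankNet-style probability that edit A is preferred over edit B."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def pick_preferred(scores):
    """Index of the highest-scoring edit (assumes higher score = better edit)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical scores standing in for the `scores` list from the usage script.
scores = [1.3, 0.2]
best = pick_preferred(scores)
prob = preference_probability(scores[0], scores[1])
print(f"preferred edit index: {best}, P(edit 0 preferred over edit 1) = {prob:.3f}")
```

Equal scores give a probability of 0.5, so values near 0.5 indicate that the model sees little difference between the two edits.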

📊 Benchmark

EditReward achieves superior alignment with human preferences in instruction-guided image editing tasks. The following tables show its performance against other models on various benchmarks.
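
As a rough illustration of what alignment with human preferences can mean for a reward model, the sketch below computes pairwise agreement: the fraction of image pairs where the higher-scored edit is also the one human annotators preferred. This is a generic metric, not necessarily the one each benchmark reports (see the paper for the exact protocols); the function name and the toy numbers are hypothetical.

```python
def pairwise_accuracy(scores_a, scores_b, human_prefers_a):
    """Fraction of pairs where the higher-scored edit matches the human choice.

    Ties (equal scores) are counted as a prediction for edit B.
    """
    if not (len(scores_a) == len(scores_b) == len(human_prefers_a)):
        raise ValueError("all inputs must have the same length")
    if not scores_a:
        raise ValueError("need at least one pair")
    hits = sum(
        (a > b) == prefers_a
        for a, b, prefers_a in zip(scores_a, scores_b, human_prefers_a)
    )
    return hits / len(scores_a)


# Toy numbers for illustration only; these are not results from the paper.
model_scores_a = [0.9, 0.1, 0.4]
model_scores_b = [0.2, 0.7, 0.6]
human_prefers_a = [True, False, True]
print(pairwise_accuracy(model_scores_a, model_scores_b, human_prefers_a))
```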

📚 Citation

Please cite our paper if you use our code, data, models, or results:

@article{wu2025editreward,
 title={EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing},
 author={Wu, Keming and Jiang, Sicong and Ku, Max and Nie, Ping and Liu, Minghao and Chen, Wenhu},
 journal={arXiv preprint arXiv:2509.26346},
 year={2025}
}

🙏 Acknowledgements

We thank the HPSv3, VideoAlign, and GenAI-Bench codebases for providing valuable references.

💬 Support

For questions and support:

Issues: GitHub Issues
Email: wukeming0608@gmail.com & wenhuchen@uwaterloo.ca

Model size: 8B params (Safetensors), tensor type: BF16

Paper: arXiv:2509.26346 (published Sep 30, 2025)

URL: https://huggingface.co/TIGER-Lab/EditReward-MiMo-VL-7B-SFT-2508
