URL: https://huggingface.co/datasets/changdae/vittle-pope-visual-perturbed


Vittle - Visually Perturbed POPE Benchmark

This dataset provides visually perturbed variants of the POPE (Polling-based Object Probing Evaluation) benchmark, built on COCO val2014 images. It is released as part of the Vittle (Visual Instruction Bottleneck Tuning) project (NeurIPS 2025).

Overview

  • Questions: 9,000 yes/no object hallucination probing questions (3,000 each for adversarial / popular / random splits)
  • Images: 500 unique COCO val2014 images, each with 9 visual perturbation variants (severity level 3)
  • Total image files: 4,500 (500 images x 9 perturbations)
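Because every question is a yes/no probe, model replies can be scored by reducing each reply to "yes" or "no" and comparing it with the reference label. The following is a minimal sketch of that idea, not the project's official evaluation script. The function names are hypothetical, and the assumption that reference labels are the lowercase strings "yes" and "no" is not confirmed by this card.

```python
def normalize_answer(text: str) -> str:
    """Map a free-form model reply to 'yes', 'no', or 'unknown'."""
    # Lowercase and turn simple punctuation into spaces before splitting.
    words = text.strip().lower().replace(".", " ").replace(",", " ").split()
    if not words:
        return "unknown"
    if words[0] == "yes":
        return "yes"
    if words[0] in ("no", "not"):
        return "no"
    return "unknown"


def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of replies whose normalized form equals the label.

    Assumes labels are already the strings 'yes' or 'no' (an assumption).
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(normalize_answer(p) == l for p, l in zip(predictions, labels))
    return correct / len(labels)
```

Replies that cannot be mapped to either answer are counted as wrong here; a different policy (for example, skipping them) may be preferable depending on the evaluation protocol.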

Visual Perturbations

All perturbations are at severity level 3, generated following MM-Robustness:

Perturbation     Folder
Gaussian Noise   images/COCO_IP_gaussian_noise_3/
Shot Noise       images/COCO_IP_shot_noise_3/
Speckle Noise    images/COCO_IP_speckle_noise_3/
Fog              images/COCO_IP_fog_3/
Contrast         images/COCO_IP_contrast_3/
Brightness       images/COCO_IP_brightness_3/
Defocus Blur     images/COCO_IP_defocus_blur_3/
Zoom Blur        images/COCO_IP_zoom_blur_3/
Frost            images/COCO_IP_frost_3/
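The folder names above follow one pattern: images/COCO_IP_<perturbation>_3/. A short sketch of building a folder path from a perturbation key is shown below. The constant and function names are hypothetical, chosen only for illustration.

```python
# Perturbation keys as they appear in the folder names listed above.
PERTURBATIONS = [
    "gaussian_noise", "shot_noise", "speckle_noise",
    "fog", "contrast", "brightness",
    "defocus_blur", "zoom_blur", "frost",
]

# The dataset ships only severity level 3.
SEVERITY = 3


def folder_for(perturbation: str) -> str:
    """Return the relative image folder for one perturbation key."""
    if perturbation not in PERTURBATIONS:
        raise ValueError(f"unknown perturbation: {perturbation!r}")
    return f"images/COCO_IP_{perturbation}_{SEVERITY}/"
```

For example, folder_for("zoom_blur") yields "images/COCO_IP_zoom_blur_3/".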

File Structure

.
├── README.md
├── llava_pope_test.jsonl              # 9,000 questions
├── annotations/
│   ├── coco_pope_adversarial.json     # 3,000 adversarial split labels
│   ├── coco_pope_popular.json         # 3,000 popular split labels
│   └── coco_pope_random.json          # 3,000 random split labels
└── images/
    ├── COCO_IP_gaussian_noise_3/      # 500 images
    ├── COCO_IP_shot_noise_3/
    ├── COCO_IP_speckle_noise_3/
    ├── COCO_IP_fog_3/
    ├── COCO_IP_contrast_3/
    ├── COCO_IP_brightness_3/
    ├── COCO_IP_defocus_blur_3/
    ├── COCO_IP_zoom_blur_3/
    └── COCO_IP_frost_3/
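After downloading, it can be useful to confirm that a local copy matches this layout. The sketch below lists any expected file or folder that is missing under a given root directory. The helper name missing_entries is hypothetical, and the check covers only the paths shown in the tree, not the number of images inside each folder.

```python
from pathlib import Path

# Data files shown in the tree above (paths relative to the dataset root).
EXPECTED_FILES = [
    "llava_pope_test.jsonl",
    "annotations/coco_pope_adversarial.json",
    "annotations/coco_pope_popular.json",
    "annotations/coco_pope_random.json",
]

# One image folder per perturbation, all at severity level 3.
EXPECTED_IMAGE_DIRS = [
    f"images/COCO_IP_{name}_3"
    for name in (
        "gaussian_noise", "shot_noise", "speckle_noise",
        "fog", "contrast", "brightness",
        "defocus_blur", "zoom_blur", "frost",
    )
]


def missing_entries(root) -> list[str]:
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    missing = [f for f in EXPECTED_FILES if not (root / f).is_file()]
    missing += [d for d in EXPECTED_IMAGE_DIRS if not (root / d).is_dir()]
    return missing
```

An empty list means every listed file and folder was found.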

Question Format (JSONL)

{"question_id": 0, "image": "COCO_val2014_000000007991.jpg", "text": "Is there a snowboard in the image?\nAnswer the question using a single word or phrase.", "category": "adversarial"}
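Each line of llava_pope_test.jsonl is one JSON object like the record above. In that record the "text" field holds the question and the answer-format instruction, separated by a newline. A minimal sketch of parsing one line and splitting the two parts follows. The function name parse_question and the output key names are hypothetical, and the newline split assumes all records share the format of the sample shown.

```python
import json


def parse_question(line: str) -> dict:
    """Parse one JSONL line and separate the question from the instruction."""
    record = json.loads(line)
    # The sample record joins question and instruction with a single newline.
    question, _, instruction = record["text"].partition("\n")
    return {
        "question_id": record["question_id"],
        "image": record["image"],
        "question": question,
        "instruction": instruction,
        "category": record["category"],
    }


# The sample record from this card, as a raw string so \n stays a JSON escape.
sample = r'{"question_id": 0, "image": "COCO_val2014_000000007991.jpg", "text": "Is there a snowboard in the image?\nAnswer the question using a single word or phrase.", "category": "adversarial"}'
parsed = parse_question(sample)
```

The "image" value is a bare file name, so the same record can be paired with any of the nine perturbation folders by joining the folder path and the file name.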

Citation

@inproceedings{
 oh2025visual,
 title={Visual Instruction Bottleneck Tuning},
 author={Changdae Oh and Jiatong Li and Shawn Im and Sharon Li},
 booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
 year={2025},
 url={https://openreview.net/forum?id=yzHiEmLSk8}
}

License

MIT
