URL: https://huggingface.co/datasets/nyu-visionx/scale-rae-data

Scale RAE Data

This repository contains data associated with the paper "Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders".

The dataset is used for training and evaluating Scale-RAE, a framework that investigates scaling Representation Autoencoders (RAEs) for large-scale, freeform text-to-image (T2I) generation. It includes data used for scaling RAE decoders beyond ImageNet, featuring web, synthetic, and text-rendering data, as well as high-quality instruction datasets for fine-tuning.
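Because this page does not describe how the files are organized, a reasonable first step is to inspect the repository before downloading anything. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and the repository is publicly readable; the helper names `summarize_extensions` and `list_dataset_files` are hypothetical and are not part of the Scale-RAE code.

```python
from collections import Counter
from pathlib import PurePosixPath

REPO_ID = "nyu-visionx/scale-rae-data"  # repository name taken from this page


def summarize_extensions(paths):
    """Count files per extension so the layout can be inspected before downloading."""
    return Counter(PurePosixPath(p).suffix or "<none>" for p in paths)


def list_dataset_files(repo_id=REPO_ID):
    """Return the file paths in the dataset repository (requires network access)."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id, repo_type="dataset")


# Example usage (performs a network request, so it is left commented out):
# files = list_dataset_files()
# print(summarize_extensions(files))
```

Once the layout is known, `huggingface_hub.snapshot_download(repo_id=REPO_ID, repo_type="dataset", allow_patterns=[...])` can fetch only the subset of interest; the patterns to use depend on the actual file names, which are not listed here.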

Citation

If you find this work useful, please cite:

@article{scale-rae-2026,
 title={Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders},
 author={Shengbang Tong and Boyang Zheng and Ziteng Wang and Bingda Tang and Nanye Ma and Ellis Brown and Jihan Yang and Rob Fergus and Yann LeCun and Saining Xie},
 journal={arXiv preprint arXiv:2601.16208},
 year={2026}
}