TextDiffuser-MARIO-10M
Dataset description
MARIO-10M is a dataset of about 10 million text images drawn from a variety of sources, such as book covers, posters, and tickets. Alongside the images, the dataset provides OCR results and caption information.
Download
The download process consists of three steps:
[1] Download all the tar files
for i in {0..500};
do wget -O "$i.tar.gz" "https://huggingface.co/datasets/JingyeChen22/TextDiffuser-MARIO-10M/resolve/main/$i.tar.gz?download=true";
done
[2] Extract the top-level archives
for i in {0..500};
do tar -xvf $i.tar.gz --strip-components=5 && rm $i.tar.gz;
done
[3] Extract the second-level archives
for i in {0..500};
do
(cd "$i" && for file in *.tar.gz; do tar -xvf "$file" --strip-components=5 && rm "$file"; done);
done
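Downloading 501 archives can be interrupted partway, so it may help to make step [1] safe to re-run. The following is a minimal sketch, not part of the official instructions: the function names `shard_url` and `download_shard` are hypothetical, and the base URL is the one used in step [1]. Each archive is written to a temporary `.part` file and renamed only after `wget` succeeds, so a truncated download is not mistaken for a complete one.

```shell
# Hypothetical helpers (names are illustrative, not from the dataset authors).
# The base URL is copied from step [1].
BASE_URL="https://huggingface.co/datasets/JingyeChen22/TextDiffuser-MARIO-10M/resolve/main"

# Print the download URL for shard number $1.
shard_url() {
    printf '%s/%s.tar.gz?download=true\n' "$BASE_URL" "$1"
}

# Download shard $1 unless its archive or its extracted directory already exists.
download_shard() {
    if [ -f "$1.tar.gz" ] || [ -d "$1" ]; then
        echo "skip $1"
        return 0
    fi
    # Write to a .part file first; rename only when wget reports success.
    wget -O "$1.tar.gz.part" "$(shard_url "$1")" && mv "$1.tar.gz.part" "$1.tar.gz"
}
```

With these definitions, the loop in step [1] could be written as `for i in {0..500}; do download_shard "$i"; done` and repeated until every shard is present.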
Finally, the directory tree should look like this:
MARIO-10M/
│
├── 0/
│   ├── 00000/
│   │   ├── 000000012/
│   │   │   ├── caption.txt
│   │   │   ├── charseg.npy
│   │   │   ├── image.jpg
│   │   │   ├── info.json
│   │   │   ├── ocr.txt
...
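After extraction, a quick sanity check on a sample directory can catch incomplete archives. The sketch below assumes that every sample directory holds the same five files shown in the tree above; the tree only shows one sample, so this is an assumption, and `check_sample_dir` is a hypothetical helper name.

```shell
# Files a sample directory is assumed to contain, based on the single
# sample shown in the directory tree.
EXPECTED_FILES="caption.txt charseg.npy image.jpg info.json ocr.txt"

# Report every expected file missing from sample directory $1.
# Returns 0 if all files are present, 1 otherwise.
check_sample_dir() {
    status=0
    for name in $EXPECTED_FILES; do
        if [ ! -f "$1/$name" ]; then
            echo "missing: $1/$name"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_sample_dir MARIO-10M/0/00000/000000012` prints nothing and returns 0 when the sample is complete.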
Citation
If you find the MARIO dataset useful in your research, please cite the following papers:
@article{chen2024textdiffuser,
title={Textdiffuser: Diffusion models as text painters},
author={Chen, Jingye and Huang, Yupan and Lv, Tengchao and Cui, Lei and Chen, Qifeng and Wei, Furu},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}
@article{chen2023textdiffuser,
title={Textdiffuser-2: Unleashing the power of language models for text rendering},
author={Chen, Jingye and Huang, Yupan and Lv, Tengchao and Cui, Lei and Chen, Qifeng and Wei, Furu},
journal={European Conference on Computer Vision},
year={2024}
}
License
