URL: https://huggingface.co/datasets/JingyeChen22/TextDiffuser-MARIO-10M

TextDiffuser-MARIO-10M

๐Ÿ‘ image/png

Dataset description

MARIO-10M is a dataset of about 10 million text images drawn from a variety of sources, such as book covers, posters, and tickets. Alongside the images, the dataset provides OCR results and captions.

Download

The download process consists of three steps:

[1] Download all the tar files

for i in {0..500}; do
  # Quote the URL so the shell does not treat "?" as a glob character.
  wget -O "$i.tar.gz" "https://huggingface.co/datasets/JingyeChen22/TextDiffuser-MARIO-10M/resolve/main/$i.tar.gz?download=true"
done

[2] Extract the top-level archives

for i in {0..500}; do
  # Delete each archive only if it extracted successfully.
  tar -xvf "$i.tar.gz" --strip-components=5 && rm "$i.tar.gz"
done

[3] Extract the second-level archives

for i in {0..500}; do
  # Run in a subshell so a failed cd cannot leave the loop in the wrong directory.
  (
    cd "$i" || exit 1
    for file in *.tar.gz; do
      tar -xvf "$file" --strip-components=5 && rm "$file"
    done
  )
done
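The three steps above can also be run one archive at a time, so that only a single downloaded archive sits on disk at any moment. The following is a minimal sketch, assuming bash, wget, and tar are available. The helper names `shard_url` and `fetch_shard` are hypothetical, and the `--strip-components=5` value is copied unchanged from the commands above.

```shell
#!/usr/bin/env bash
BASE_URL="https://huggingface.co/datasets/JingyeChen22/TextDiffuser-MARIO-10M/resolve/main"

# Build the download URL for archive number $1 (hypothetical helper).
shard_url() {
  echo "$BASE_URL/$1.tar.gz?download=true"
}

# Download, extract, and clean up one archive (hypothetical helper).
fetch_shard() {
  local i="$1" file
  wget -O "$i.tar.gz" "$(shard_url "$i")" || return 1
  tar -xf "$i.tar.gz" --strip-components=5 || return 1
  rm "$i.tar.gz"
  # Extract the second-level archives inside directory $i.
  (
    cd "$i" || exit 1
    for file in *.tar.gz; do
      [ -e "$file" ] || continue   # no nested archives found
      tar -xf "$file" --strip-components=5 && rm "$file"
    done
  )
}
```

A loop such as `for i in {0..500}; do fetch_shard "$i" || echo "archive $i failed" >&2; done` would then process every archive and report the ones that need to be retried.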

Finally, the directory tree should look like this:

MARIO-10M/
│
├── 0/
│ ├── 00000/
│ ├──── 000000012/
│ ├──────── caption.txt
│ ├──────── charseg.npy
│ ├──────── image.jpg
│ ├──────── info.json
│ ├──────── ocr.txt
...
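After extraction, it may be worth checking that every sample directory contains the five files listed in the tree. The following is a rough sketch, assuming bash and the three-level layout shown above (archive index, subfolder, sample id). The function name `check_mario_tree` is hypothetical and not part of the dataset's tooling.

```shell
#!/usr/bin/env bash
# Files each sample directory is expected to contain, per the tree above.
EXPECTED_FILES="caption.txt charseg.npy image.jpg info.json ocr.txt"

# Print one line per missing file under ROOT, then a total.
# Returns non-zero if anything is missing.
# Assumed layout: ROOT/<archive index>/<subfolder>/<sample id>/<files>
check_mario_tree() {
  local root="$1" missing=0 sample name
  for sample in "$root"/*/*/*/; do
    [ -d "$sample" ] || continue   # the glob matched nothing
    for name in $EXPECTED_FILES; do
      if [ ! -f "$sample$name" ]; then
        echo "missing: $sample$name"
        missing=$((missing + 1))
      fi
    done
  done
  echo "missing files: $missing"
  [ "$missing" -eq 0 ]
}
```

Running `check_mario_tree MARIO-10M` from the parent directory would list any incomplete samples. Walking roughly 10 million directories this way is slow, so passing a single subdirectory first is a reasonable spot check.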

Citation

If you find the MARIO dataset useful in your research, please cite the following papers:

@article{chen2024textdiffuser,
 title={Textdiffuser: Diffusion models as text painters},
 author={Chen, Jingye and Huang, Yupan and Lv, Tengchao and Cui, Lei and Chen, Qifeng and Wei, Furu},
 journal={Advances in Neural Information Processing Systems},
 volume={36},
 year={2024}
}

@article{chen2023textdiffuser,
 title={Textdiffuser-2: Unleashing the power of language models for text rendering},
 author={Chen, Jingye and Huang, Yupan and Lv, Tengchao and Cui, Lei and Chen, Qifeng and Wei, Furu},
 journal={European Conference on Computer Vision},
 year={2024}
}

License

Microsoft Open Source Code of Conduct
