An open, billion-scale corpus of images interleaved with text.
The core fewer-faces subset of mmc4 (Multimodal-C4).
| Subset | # images | # docs | # tokens |
|---|---|---|---|
| Multimodal-C4 core fewer-faces (mmc4-core-ff) | 22.4M | 5.5M | 1.8B |
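
To give a feel for how densely images and text are interleaved, the counts in the table can be turned into per-document averages. The sketch below uses only the rounded figures published above, so the results are approximate; the variable names are illustrative and not part of any mmc4 tooling.

```python
# Headline counts copied from the table (rounded as published).
num_images = 22.4e6   # 22.4M images
num_docs = 5.5e6      # 5.5M documents
num_tokens = 1.8e9    # 1.8B tokens

# Simple averages; these inherit the rounding of the inputs.
images_per_doc = num_images / num_docs
tokens_per_doc = num_tokens / num_docs
tokens_per_image = num_tokens / num_images

print(f"images per document: {images_per_doc:.1f}")
print(f"tokens per document: {tokens_per_doc:.0f}")
print(f"tokens per image:    {tokens_per_image:.0f}")
```

On these rounded figures, a document averages roughly four images and a few hundred tokens of text.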
