Fineweb subset
[Work-In-Progress] Filtering documents by how useful their knowledge could be for pretraining a small model.
I'm targeting only 3-4% of the original FineWeb's size.
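As a rough illustration of what a 3-4% target means, here is a minimal sketch that keeps only the highest-scoring documents. It is not the actual pipeline behind this subset: the `keep_top_fraction` helper and the per-document usefulness `scores` are hypothetical, and the fraction is measured in document count rather than tokens or bytes.

```python
def keep_top_fraction(docs, scores, fraction=0.035):
    """Keep roughly `fraction` of `docs`, choosing the highest-scoring ones.

    `scores` is a hypothetical per-document usefulness score (higher is better);
    how such a score is produced is outside the scope of this sketch.
    """
    if len(docs) != len(scores):
        raise ValueError("docs and scores must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not docs:
        return []

    # Number of documents to keep, at least one.
    n_keep = max(1, round(len(docs) * fraction))

    # Rank document indices by score, best first.
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

    # Restore the original document order among the survivors.
    kept = sorted(ranked[:n_keep])
    return [docs[i] for i in kept]
```

On a corpus the size of FineWeb, the same idea would more likely be applied as a score threshold over a stream of documents than as a full sort in memory.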