
URL: https://huggingface.co/datasets/MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5

MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5 · Datasets at Hugging Face



This repository contains the refined pre-training corpus from the paper "MiniPLM: Knowledge Distillation for Pre-Training Language Models".

Code: https://github.com/thu-coai/MiniPLM
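The files in this repository can probably be fetched with the `huggingface_hub` client. The following is a minimal sketch, not an official recipe from the authors: it assumes `huggingface_hub` is installed, the helper name `download_corpus` and the default target directory are hypothetical, and it makes no assumption about the file layout inside the repository (see the MiniPLM code above for how the corpus is actually consumed).

```python
from pathlib import Path

# Repository id taken from this page's URL.
REPO_ID = "MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5"


def download_corpus(local_dir: str = "miniplm_corpus") -> Path:
    """Download every file of the dataset repository into local_dir.

    Hypothetical helper for illustration; the default directory name
    is an assumption, not something the dataset card prescribes.
    """
    # Imported here so the module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    # repo_type="dataset" is required because the default repo type is "model".
    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )
    return Path(path)


# Example (requires network access and enough disk space):
#   root = download_corpus()
#   for p in sorted(root.rglob("*"))[:20]:
#       print(p.relative_to(root))
```

Downloading a raw snapshot rather than calling `datasets.load_dataset` is a deliberate choice here, since it does not depend on the files being in a format the `datasets` loader recognizes automatically.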

