VOOZH about

URL: https://huggingface.co/datasets/lewtun/openthoughts-100

⇱ lewtun/openthoughts-100 · Datasets at Hugging Face


Dataset Viewer
Duplicate

OpenThoughts-100

A 100-sample subset of the open-thoughts/OpenThoughts-114k dataset.

Description

This is a small, reproducible subset of the OpenThoughts synthetic reasoning dataset containing 100 randomly selected examples. It was created by shuffling the original dataset with seed 42 and selecting the first 100 examples.

Dataset Details

Usage

from datasets import load_dataset

ds = load_dataset("lewtun/openthoughts-100", split="train")
print(len(ds)) # 100

Format

The dataset is stored as a JSONL file with the following schema:

  • system: System prompt instructing the model to use long thinking process
  • conversations: List of message turns with from (user/assistant) and value fields
Downloads last month
33