The Grounding Claims Dataset is a multi-domain dataset for evaluating whether a natural language claim is grounded in (i.e., supported or entailed by) a document. The dataset is organized into four subsets, each requiring a different type of reasoning:
- general (1500 examples): Broad, everyday reasoning
- logical (1000 examples): Logical consistency and inference
- time_and_dates (100 examples): Temporal reasoning
- prices_and_math (100 examples): Numerical and mathematical reasoning
Each entry consists of:
- `doc`: A short context or passage
- `claim`: A natural language statement to verify against the `doc`
- `label`: A binary label indicating whether the claim is grounded in the document (`1` for grounded, `0` for ungrounded)
- `dataset`: The source subset name (e.g., `"general"`)
📌 Features
| Feature | Type | Description |
|---|---|---|
| `doc` | string | The document or passage providing the context |
| `claim` | string | A statement to verify against the document |
| `label` | string | `grounded` or `ungrounded` |
| `dataset` | string | The domain/subset the instance belongs to |
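
As a rough illustration of the schema above, the following sketch checks that a row has the four string fields and that `label` and `dataset` take the values listed on this card. The names `GroundingExample`, `validate_example`, and the sample row are hypothetical and are not shipped with the dataset; the row itself is made up for illustration.

```python
from typing import TypedDict

# Field names, label values, and subset names as listed on this card.
FIELDS = ("doc", "claim", "label", "dataset")
LABELS = ("grounded", "ungrounded")
SUBSETS = ("general", "logical", "time_and_dates", "prices_and_math")


class GroundingExample(TypedDict):
    """Hypothetical type mirroring the feature table (all fields are strings)."""
    doc: str
    claim: str
    label: str
    dataset: str


def validate_example(row: dict) -> list[str]:
    """Return a list of problems found in one row (empty if the row looks fine)."""
    problems = []
    for field in FIELDS:
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], str):
            problems.append(f"field {field} is not a string")
    if "label" in row and row["label"] not in LABELS:
        problems.append(f"unexpected label: {row['label']!r}")
    if "dataset" in row and row["dataset"] not in SUBSETS:
        problems.append(f"unexpected subset: {row['dataset']!r}")
    return problems


# Made-up row for illustration only; it is not taken from the dataset.
example: GroundingExample = {
    "doc": "The store opens at 9 am and closes at 5 pm.",
    "claim": "The store is open for eight hours a day.",
    "label": "grounded",
    "dataset": "time_and_dates",
}
print(validate_example(example))
```

A check like this can be run once over a loaded split to catch rows whose label encoding differs from what the table describes.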
📊 Usage
This dataset can be used to train and evaluate models on factual verification, natural language inference (NLI), and claim grounding tasks across multiple domains.
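
Because the subsets differ a lot in size (1500 and 1000 examples versus 100 each), a single pooled accuracy would be dominated by `general` and `logical`. One way to evaluate is to report accuracy per subset, as in this minimal sketch. The function name `accuracy_by_subset` is hypothetical, and it assumes rows are dictionaries with `label` and `dataset` fields and that predictions use the same label values as the rows.

```python
from collections import defaultdict


def accuracy_by_subset(rows, predictions):
    """Compute accuracy separately for each value of the `dataset` field.

    rows: iterable of dicts with at least "label" and "dataset" keys.
    predictions: iterable of predicted labels, aligned with rows.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, predicted in zip(rows, predictions):
        subset = row["dataset"]
        total[subset] += 1
        if predicted == row["label"]:
            correct[subset] += 1
    return {subset: correct[subset] / total[subset] for subset in total}


# Toy rows for illustration only; they are not taken from the dataset.
toy_rows = [
    {"label": "grounded", "dataset": "general"},
    {"label": "ungrounded", "dataset": "general"},
    {"label": "grounded", "dataset": "logical"},
]
toy_predictions = ["grounded", "grounded", "grounded"]
print(accuracy_by_subset(toy_rows, toy_predictions))
```

Reporting the four numbers side by side makes it easier to see whether a model struggles specifically with temporal or numerical claims.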
🏷️ Labels
- `grounded`: The claim is grounded in the document.
- `ungrounded`: The claim is ungrounded or contradicted by the document.
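
This card describes the label both as a binary value (`1` for grounded, `0` for ungrounded) and as the strings `grounded` / `ungrounded`. A small helper like the one below, which is a hypothetical sketch and not part of the dataset, maps either encoding to the string form so downstream code does not depend on which one a given copy of the data uses.

```python
def normalize_label(value) -> str:
    """Map 1/0 or "grounded"/"ungrounded" to the string form of the label."""
    mapping = {
        1: "grounded",
        0: "ungrounded",
        "1": "grounded",
        "0": "ungrounded",
        "grounded": "grounded",
        "ungrounded": "ungrounded",
    }
    # Strings are trimmed and lowercased before lookup; other values are used as-is.
    key = value.strip().lower() if isinstance(value, str) else value
    if key not in mapping:
        raise ValueError(f"unrecognized label: {value!r}")
    return mapping[key]


print(normalize_label(1))
print(normalize_label("Ungrounded"))
```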
