SemanticQA

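A benchmark for evaluating language models on semantic phrase processing, introduced in the paper "Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models" (arXiv:2405.02861). It is distributed as 12 configs covering four phrase types: collocations, idioms, noun compounds, and verbal multiword expressions (MWEs).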

Usage

from datasets import load_dataset

# Load a specific subset
dataset = load_dataset("jacklanda/SemanticQA", "idiom_detection")

# Available configs:
# collocate_retrieval, collocation_categorization, collocation_extraction,
# collocation_paraphrase, idiom_detection, idiom_extraction, idiom_paraphrase,
# noun_compound_compositionality, noun_compound_compositionality_ft,
# noun_compound_extraction, noun_compound_interpretation, verbal_mwe_extraction
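
To load several subsets at once, the config list above can be wrapped in a small helper. This is a minimal sketch: `load_subsets` and `CONFIGS` are hypothetical names introduced here, not part of the dataset or the `datasets` library. Split and column names are not documented in this card, so the sketch only returns whatever `load_dataset` produces for each config.

```python
from typing import Dict, Iterable

# Config names copied from the "Available configs" list in this card.
CONFIGS = [
    "collocate_retrieval",
    "collocation_categorization",
    "collocation_extraction",
    "collocation_paraphrase",
    "idiom_detection",
    "idiom_extraction",
    "idiom_paraphrase",
    "noun_compound_compositionality",
    "noun_compound_compositionality_ft",
    "noun_compound_extraction",
    "noun_compound_interpretation",
    "verbal_mwe_extraction",
]


def load_subsets(
    names: Iterable[str] = CONFIGS,
    repo: str = "jacklanda/SemanticQA",
) -> Dict[str, object]:
    """Load each requested config and return a {config name: dataset} mapping."""
    requested = list(names)

    # Fail early on typos instead of waiting for a download error.
    unknown = sorted(set(requested) - set(CONFIGS))
    if unknown:
        raise ValueError(f"Unknown config(s): {unknown}")

    # Imported lazily so the name check above works without the dependency.
    from datasets import load_dataset  # requires `pip install datasets`

    return {name: load_dataset(repo, name) for name in requested}


# Example (downloads data, so it is left commented out):
# subsets = load_subsets(["idiom_detection", "idiom_extraction"])
# for name, ds in subsets.items():
#     print(name, ds)
```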

Subsets

| Config | Task | Phrase Type | Size | Eval Metrics |
| --- | --- | --- | --- | --- |
| `collocate_retrieval` | Collocate Retrieval (CR) | Collocation | 306 | Exact Match |
| `collocation_categorization` | Collocation Categorization (LCC) | Collocation | 305 | Accuracy, F1 |
| `collocation_extraction` | Collocation Extraction (LCE) | Collocation | 305 | Exact Match |
| `collocation_paraphrase` | Collocation Interpretation (LCI) | Collocation | 305 | ROUGE-L, BERTScore, METEOR, BLEU |
| `idiom_detection` | Idiom Detection (IED) | Idiom | 273 | MCQ Accuracy |
| `idiom_extraction` | Idiom Extraction (IEE) | Idiom | 447 | Exact Match |
| `idiom_paraphrase` | Idiom Interpretation (IEI) | Idiom | 818 | ROUGE-L, BERTScore, METEOR, BLEU |
| `noun_compound_compositionality` | NC Compositionality (NCC) | Noun Compound | 242 | MCQ Accuracy |
| `noun_compound_compositionality_ft` | NCC Fine-tuning splits | Noun Compound | 242 | - |
| `noun_compound_extraction` | NC Extraction (NCE) | Noun Compound | 720 | Exact Match |
| `noun_compound_interpretation` | NC Interpretation (NCI) | Noun Compound | 110 | ROUGE-L, BERTScore, METEOR, BLEU |
| `verbal_mwe_extraction` | VMWE Extraction | Verbal MWE | 475 | Exact Match |
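
The simpler metrics in the table can be approximated in plain Python. The sketch below is illustrative only and is not the benchmark's official scoring code. The normalization (lowercasing, whitespace collapsing), the whitespace tokenization used for ROUGE-L, and all function names are assumptions made here. BERTScore, METEOR, and BLEU need external packages and are not shown.

```python
import re
from typing import List, Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def mcq_accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Accuracy over multiple-choice option labels such as "A" or "B"."""
    return exact_match(predicted, gold)


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """Sentence-level ROUGE-L F1 over whitespace tokens (simplified)."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Standard ROUGE implementations apply their own tokenization and optional stemming, so scores from this sketch may differ from those reported in the paper.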

Citation

@article{liu2026revisiting,
 title={Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models},
 author={Liu, Yang and Li, Hongming and Qin, Melissa Xiaohui and Liu, Qiankun and Huang, Chao},
 journal={arXiv preprint arXiv:2604.16593},
 year={2026}
}

@article{liu2024revisiting,
 title={Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models},
 author={Liu, Yang and Qin, Melissa Xiaohui and Li, Hongming and Huang, Chao},
 journal={arXiv preprint arXiv:2405.02861},
 year={2024}
}

License

MIT

Papers

arXiv:2604.16593, published Apr 17
arXiv:2405.02861, published May 5, 2024