Dataset Preview
| model_id (string) | benchmark (string) | benchmark_key (string) | score (float64) | source_type (string) | source_url (string) | contributor (string) | collected_at (string) |
|---|---|---|---|---|---|---|---|
| allenai/Olmo-3-1125-32B | ARC MC | arc_mc | 94.7 | model-card | https://huggingface.co/allenai/Olmo-3-1125-32B | allenai | 2025-11-28T10:29:29.217305+00:00 |
| allenai/Olmo-3-32B-Think | MMLU | mmlu | 85.4 | pull-request | https://huggingface.co/allenai/Olmo-3-32B-Think/discussions/10 | burtenshaw | 2025-11-28T10:29:20.486811+00:00 |
| allenai/Olmo-3-1125-32B | MMLU | mmlu | 83.9 | model-card | https://huggingface.co/allenai/Olmo-3-1125-32B | allenai | 2025-11-28T10:29:29.217309+00:00 |
| allenai/Olmo-3-1125-32B | BigCodeBench | bigcodebench | 43.9 | model-card | https://huggingface.co/allenai/Olmo-3-1125-32B | allenai | 2025-11-28T10:29:29.217285+00:00 |
| Qwen/Qwen2.5-7B-Instruct | MMLU | mmlu | 36.55 | pull-request | https://huggingface.co/Qwen/Qwen2.5-7B-Instruct/discussions/4 | leaderboard-pr-bot | 2025-11-28T10:30:39.510190+00:00 |
| null | null | null | null | null | null | null | null |
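
As a rough illustration of how rows with this schema might be consumed, the sketch below copies a few preview rows into plain dictionaries and picks the highest score per `benchmark_key`, skipping the all-null row. The in-memory `ROWS` list and the helper name `best_per_benchmark` are assumptions made for this example and are not part of the dataset.

```python
# A few columns from the preview rows above, copied by hand.
# The all-null row is kept on purpose to show how it is skipped.
ROWS = [
    {"model_id": "allenai/Olmo-3-1125-32B", "benchmark_key": "arc_mc", "score": 94.7},
    {"model_id": "allenai/Olmo-3-32B-Think", "benchmark_key": "mmlu", "score": 85.4},
    {"model_id": "allenai/Olmo-3-1125-32B", "benchmark_key": "mmlu", "score": 83.9},
    {"model_id": "allenai/Olmo-3-1125-32B", "benchmark_key": "bigcodebench", "score": 43.9},
    {"model_id": "Qwen/Qwen2.5-7B-Instruct", "benchmark_key": "mmlu", "score": 36.55},
    {"model_id": None, "benchmark_key": None, "score": None},
]


def best_per_benchmark(rows):
    """Return {benchmark_key: row} holding the highest-scoring row per benchmark.

    Hypothetical helper: rows with a missing key or score are ignored.
    """
    best = {}
    for row in rows:
        key, score = row["benchmark_key"], row["score"]
        if key is None or score is None:
            continue  # skip the all-null row
        if key not in best or score > best[key]["score"]:
            best[key] = row
    return best


if __name__ == "__main__":
    best = best_per_benchmark(ROWS)
    for key in sorted(best):
        print(key, best[key]["model_id"], best[key]["score"])
```

Scores from different `source_type` values (model card versus pull request) may not be measured under the same settings, so a comparison like this is only indicative.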
- Downloads last month: 11
