Voozh

Add evaluation results for HLE, MMLU-Pro

by SaylorTwift HF Staff - opened Feb 16


Qwen org Feb 16

 This PR adds evaluation results extracted from the Model Card.

 **Benchmarks:**
 - MMLU-Pro: 87.8
 - HLE: 48.3

 **Files created:**
 - .eval_results/mmlu_pro.yaml
 - .eval_results/hle_with_tools.yaml
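
 A reviewer might want to sanity-check the two files against the scores listed above before merging. The following is a minimal sketch, not part of the extractor: `check_eval_files` and `EXPECTED` are hypothetical names, the pairing of `hle_with_tools.yaml` with the HLE score is inferred from the file name, and the check assumes each score appears literally in its file. It makes no assumption about the YAML schema.

```python
from pathlib import Path

# Scores and file names as listed in this PR description.
# The file-to-benchmark pairing is an assumption based on the file names.
EXPECTED = {
    ".eval_results/mmlu_pro.yaml": ("MMLU-Pro", 87.8),
    ".eval_results/hle_with_tools.yaml": ("HLE", 48.3),
}


def check_eval_files(repo_root):
    """Return a list of problems; an empty list means every file
    exists and contains its expected score as plain text."""
    problems = []
    for rel_path, (benchmark, score) in EXPECTED.items():
        path = Path(repo_root) / rel_path
        if not path.is_file():
            problems.append(f"{benchmark}: missing {rel_path}")
            continue
        # Plain substring search; the file is not parsed as YAML.
        if str(score) not in path.read_text(encoding="utf-8"):
            problems.append(f"{benchmark}: {score} not found in {rel_path}")
    return problems
```

 Running `check_eval_files(".")` from a local checkout of this branch would return an empty list if both files are present and mention their scores.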

 ---

 Extracted automatically using the [LLM-powered evaluation extractor](https://github.com/huggingface/community-evals).

Feb 20

Ssss

Ready to merge

This branch is ready to get merged automatically.
