index (string) | id (string) | context (string) | question (string) | marking (string) | answer (string) | answer_type (string) | unit (string) | points (string) | modality (string) | field (string) | source (string) | image_question (string) | information (string) | image (list of images) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | APhO_2025_1_A_1 | [Precession of the Earth's axis] [Introduction] It has been known since ancient times that the Earth's axis of rotation precesses. That is, the axis itself rotates around the line perpendicular to the ecliptic plane, i.e., the plane containing the Earth's orbit around the Sun. Ancient Greek astronomer Hipparchus co... | Find the values of exponents: (1) $\beta$, (2) $\gamma$, and (3) $\delta$. | [["Award 0.2 pt if the answer correctly expresses the dimension of $G$ as $[G] = L^3 M^{-1} T^{-2}$, where $L$ is the base dimensions length, $M$ is mass, and $T$ is time. Otherwise, award 0 pt.", "Award 0.1 pt if the answer correctly sets up the exponent equation $0 = 2 - \\beta$. Otherwise, award 0 pt.", "Award 0.1 p... | ["\\boxed{$\\beta = 2$}", "\\boxed{$\\gamma = -1$}", "\\boxed{$\\delta = 4$}"] | ["Numerical Value", "Numerical Value", "Numerical Value"] | [null, null, null] | [0.3, 0.3, 0.2] | text+illustration figure | Mechanics | APhO_2025 | (base64-encoded JPEG data, truncated) | None. | |
1 | APhO_2025_1_A_2 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Calculate the numerical value of $h_{\\text{max}}$ in $km$ assuming that the dimensionless factor i(...TRUNCATED) | "[[\"Award 0.1 pt if the answer correctly calculates $\\\\omega = \\\\frac{2\\\\pi}{24 h} = 7.27 \\\(...TRUNCATED) | ["\\boxed{21.9}"] | ["Numerical Value"] | ["km"] | [0.2] | text+illustration figure | Mechanics | APhO_2025 | (base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
2 | APhO_2025_1_B_1 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Find (1) the direction and (2) magnitude of the gravitational field generated by the Sun ring at a (...TRUNCATED) | "[[\"Award 0.2 pt if the answer correctly expresses the magnitude of $U(z) = -G \\\\frac{M_S}{\\\\sq(...TRUNCATED) | "[\"\\\\boxed{The direction of the gravitational field is toward the center of the Sun ring.}\", \"\(...TRUNCATED) | ["Open-Ended", "Expression"] | [null, null] | [0.2, 0.8] | text+illustration figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
3 | APhO_2025_1_B_2 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Find (1) the direction and (2) magnitude of the gravitational field generated by the Sun ring at a (...TRUNCATED) | "[[\"Award 0.5 pt if the answer presents the idea of using Gauss's law to calculate the gravitationa(...TRUNCATED) | "[\"\\\\boxed{The direction of the gravitational field is outward along the radial direction.}\", \"(...TRUNCATED) | ["Open-Ended", "Expression"] | [null, null] | [0.2, 2.0] | text+illustration figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
4 | APhO_2025_1_C_1 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Find the mass $m$ of one of the two excess regions indicated in Figure C.1. Express your answer in (...TRUNCATED) | "[[\"Award 0.2 pt if the answer includes the idea of transforming the ellipsoid of revolution into a(...TRUNCATED) | ["\\boxed{$m = \\frac{h_{\\max}}{2 R_p} M_E$}"] | ["Expression"] | [null] | [0.8] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
5 | APhO_2025_1_C_2 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Given this idea, find the torque $\\tau$ exerted by the Sun ring on the Earth. Express your answer (...TRUNCATED) | "[[\"Award 0.1 pt if the answer mentions that the net torque acting on a perfect sphere of radius $R(...TRUNCATED) | "[\"\\\\boxed{$|\\\\tau| = \\\\frac{3}{5} \\\\cdot \\\\frac{G M_E M_S}{d_{SE}^3} \\\\cdot R h_{\\\\m(...TRUNCATED) | ["Expression"] | [null] | [1.8] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
6 | APhO_2025_1_D_1 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "Give an expression for the period $T_{1}$ of precession of the Earth's axis. Express your answer in(...TRUNCATED) | "[[\"Award 0.2 pt if the answer applies Newton's second law for rotational motion, $\\\\vec{\\\\tau}(...TRUNCATED) | "[\"\\\\boxed{$T_1 = \\\\frac{4 \\\\pi}{3} \\\\cdot \\\\frac{d_{SE}^3 R \\\\omega}{G M_S h_{\\\\max}(...TRUNCATED) | ["Expression"] | [null] | [1.8] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
7 | APhO_2025_1_D_2 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | Calculate the precession period $T_{1}$ in years. | "[[\"Award 0.2 pt if the answer gives the correct numerical result for the precession period as $T_1(...TRUNCATED) | ["\\boxed{80600}"] | ["Numerical Value"] | ["years"] | [0.2] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
8 | APhO_2025_1_E_1 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | "By what factor $T_{2} / T_{1}$ does the period of precession of the Earth's axis change if we also (...TRUNCATED) | "[[\"Award 0.3 pt if the answer explicitly states that the torques exerted by the Sun and the Moon a(...TRUNCATED) | ["\\boxed{$T_2 / T_1 = \\frac{M_S / d_{SE}^3}{M_M / d_{ME}^3 + M_S / d_{SE}^3}$}"] | ["Expression"] | [null] | [1.0] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
9 | APhO_2025_1_E_2 | "[Precession of the Earth's axis] \n\n[Introduction] \n\nIt has been known since ancient times that (...TRUNCATED) | By substituting the data, calculate the period of precession $T_{2}$ in years. | "[[\"Award 0.2 pt if the answer gives the correct numerical result for the precession period $T_2 \\(...TRUNCATED) | ["\\boxed{25400}"] | ["Numerical Value"] | ["years"] | [0.2] | text+variable figure | Mechanics | APhO_2025 | (list of base64-encoded JPEG data, truncated) | None. | [{"src":"https://datasets-server.huggingface.co/assets/HY-Wan/HiPhO/--/{dataset_git_revision}/--/def(...TRUNCATED) |
HiPhO: High School Physics Olympiad Benchmark
New (Sep. 16): We launched "PhyArena", a physics reasoning leaderboard incorporating the HiPhO benchmark.
Introduction
HiPhO (High School Physics Olympiad Benchmark) is the first benchmark specifically designed to evaluate the physical reasoning abilities of (M)LLMs on real-world Physics Olympiads from 2024–2025.
Key Features
- Up-to-date Coverage: Includes 13 Olympiad exam papers from 2024–2025 across international and regional competitions.
- Mixed-modal Content: Supports four modality types, spanning from text-only to diagram-based problems.
- Professional Evaluation: Uses official marking schemes for answer-level and step-level grading.
- Human-level Comparison: Maps model scores to medal levels (Gold/Silver/Bronze) and compares with human performance.
IPhO 2025 (Theory) Results
- Top-1 Human Score: 29.2 / 30.0
- Top-1 Model Score: 22.7 / 29.4 (Gemini-2.5-Pro)
- Gold Threshold: 19.7
- Silver Threshold: 12.1
- Bronze Threshold: 7.2
Although models like Gemini-2.5-Pro and GPT-5 achieved gold-level scores, they still fall noticeably short of the very best human contestants.
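The medal mapping can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's own code: the function name `medal_for` is hypothetical, the thresholds are copied from the list above, and treating a score exactly equal to a threshold as earning that medal is an assumption.

```python
# Thresholds copied from the IPhO 2025 (Theory) results above,
# ordered from highest to lowest.
IPHO_2025_THRESHOLDS = [("Gold", 19.7), ("Silver", 12.1), ("Bronze", 7.2)]


def medal_for(score, thresholds=IPHO_2025_THRESHOLDS):
    """Return the highest medal whose threshold the score meets, else None.

    Assumption: a score equal to a threshold counts as reaching it.
    """
    for medal, cutoff in thresholds:
        if score >= cutoff:
            return medal
    return None


# 22.7 is the top model score reported above; it clears the gold threshold.
print(medal_for(22.7))  # Gold
```

Keeping the thresholds in a list rather than hard-coding comparisons makes it straightforward to pass in the cutoffs of a different exam.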
Dataset Overview
HiPhO contains:
- 13 Physics Olympiads
- 360 Problems
- Categorized across:
  - 5 Physics Fields: Mechanics, Electromagnetism, Thermodynamics, Optics, Modern Physics
  - 4 Modality Types: Text-Only, Text+Illustration Figure, Text+Variable Figure, Text+Data Figure
  - 6 Answer Types: Expression, Numerical Value, Multiple Choice, Equation, Open-Ended, Inequality
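In the dataset preview, fields such as `answer`, `answer_type`, `unit`, and `points` are typed as strings but appear to hold JSON-encoded lists with one entry per sub-answer. The sketch below decodes them under that assumption; the helper name `split_sub_answers` is hypothetical, and the sample values are copied from the preview row `APhO_2025_1_A_1`.

```python
import json

# Sample values copied from the preview row with id APhO_2025_1_A_1.
# Each field is a string that looks like a JSON-encoded list (assumption).
row = {
    "id": "APhO_2025_1_A_1",
    "answer": r'["\\boxed{$\\beta = 2$}", "\\boxed{$\\gamma = -1$}", "\\boxed{$\\delta = 4$}"]',
    "answer_type": '["Numerical Value", "Numerical Value", "Numerical Value"]',
    "unit": "[null, null, null]",
    "points": "[0.3, 0.3, 0.2]",
}


def split_sub_answers(row):
    """Decode the list-valued string fields and pair them up per sub-answer."""
    answers = json.loads(row["answer"])
    types = json.loads(row["answer_type"])
    units = json.loads(row["unit"])
    points = json.loads(row["points"])
    if not (len(answers) == len(types) == len(units) == len(points)):
        raise ValueError("list-valued fields have different lengths")
    return [
        {"answer": a, "answer_type": t, "unit": u, "points": p}
        for a, t, u, p in zip(answers, types, units, points)
    ]


parts = split_sub_answers(row)
for part in parts:
    print(part["answer"], part["answer_type"], part["unit"], part["points"])
```

JSON `null` decodes to Python `None`, so sub-answers without a unit can be detected with a plain `is None` check.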
Evaluation is conducted using:
- Answer-level and step-level scoring, aligned with official marking schemes
- Exam score as the evaluation metric
- Medal-based comparison, using official thresholds for gold, silver, and bronze
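As a rough illustration of how answer-level and step-level credit could roll up into an exam score, consider the sketch below. The text above does not say how the two levels are combined, so taking the larger of the two per sub-answer is purely an assumption for illustration. The function names and the grading outcomes in `graded` are hypothetical; only the point values (0.3, 0.3, 0.2) come from the preview row `APhO_2025_1_A_1`.

```python
def sub_answer_score(points, answer_correct, step_awards):
    """Score one sub-answer.

    points         -- maximum points for this sub-answer
    answer_correct -- whether the final answer was judged correct
    step_awards    -- points granted for individual marking-scheme criteria

    Assumption: the final score is the larger of the answer-level credit
    and the step-level credit, with step credit capped at `points`.
    """
    answer_level = points if answer_correct else 0.0
    step_level = min(sum(step_awards), points)
    return max(answer_level, step_level)


def exam_score(graded):
    """Sum sub-answer scores over every (points, correct, step_awards) triple."""
    return sum(sub_answer_score(p, ok, steps) for p, ok, steps in graded)


# Hypothetical grading outcomes for three sub-answers worth 0.3, 0.3, 0.2 pt.
graded = [
    (0.3, True, [0.2, 0.1]),   # final answer correct: full credit
    (0.3, False, [0.1]),       # wrong final answer, one criterion met
    (0.2, False, []),          # no credit
]
print(round(exam_score(graded), 2))
```

The resulting total is what would then be compared against the medal thresholds.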
Modality Categorization
- Text-Only (TO): Pure text, no figures
- Text+Illustration Figure (TI): Figures illustrate physical setups
- Text+Variable Figure (TV): Figures define key variables or geometry
- Text+Data Figure (TD): Figures show plots, data, or functions absent from text
As models move from TO → TD, performance drops sharply, highlighting the challenges of visual reasoning.
Main Results
- Closed-source reasoning MLLMs lead the benchmark, earning 6–12 gold medals (Top 5: Gemini-2.5-Pro, Gemini-2.5-Flash-Thinking, GPT-5, o3, Grok-4)
- Open-source MLLMs mostly score at or below the bronze level
- Open-source LLMs demonstrate stronger reasoning and generally outperform open-source MLLMs
Download
- Dataset & Annotations: https://huggingface.co/datasets/SciYu/HiPhO
- GitHub Repository: https://github.com/SciYu/HiPhO
- Paper: https://arxiv.org/abs/2509.07894
- Contact: fangchenyu@link.cuhk.edu.cn
Citation
@article{hipho2025,
title={HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?},
author={Yu, Fangchen and Wan, Haiyuan and Cheng, Qianjia and Zhang, Yuchen and Chen, Jiacheng and Han, Fujun and Wu, Yulun and Yao, Junchi and Hu, Ruilizhen and Ding, Ning and Cheng, Yu and Chen, Tao and Bai, Lei and Zhou, Dongzhan and Luo, Yun and Cui, Ganqu and Ye, Peng},
journal={arXiv preprint arXiv:2509.07894},
year={2025}
}
