model_name (string) | organization (string) | license (string) | rating (float64) | rating_lower (float64) | rating_upper (float64) | variance (float64) | vote_count (float64) | rank (int64) | category (string) | leaderboard_publish_date (string) |
|---|---|---|---|---|---|---|---|---|---|---|
claude-fable-5 | anthropic | Proprietary | 1,508.162714 | 1,498.907926 | 1,517.417502 | 22.296503 | 4,366 | 1 | overall | 2026-06-25 |
claude-opus-4-6-thinking | anthropic | Proprietary | 1,503.086115 | 1,499.186668 | 1,506.985563 | 3.958312 | 51,769 | 2 | overall | 2026-06-25 |
claude-opus-4-7-thinking | anthropic | Proprietary | 1,502.307687 | 1,497.815951 | 1,506.799424 | 5.252092 | 38,326 | 3 | overall | 2026-06-25 |
claude-opus-4-6 | anthropic | Proprietary | 1,499.068978 | 1,495.266239 | 1,502.871716 | 3.764408 | 55,027 | 4 | overall | 2026-06-25 |
claude-opus-4-7 | anthropic | Proprietary | 1,493.802678 | 1,489.315395 | 1,498.289961 | 5.241683 | 39,550 | 5 | overall | 2026-06-25 |
muse-spark | meta | Proprietary | 1,487.27495 | 1,481.41017 | 1,493.13973 | 8.953798 | 13,598 | 6 | overall | 2026-06-25 |
gemini-3.1-pro-preview | google | Proprietary | 1,486.086602 | 1,482.37916 | 1,489.794045 | 3.578102 | 68,291 | 7 | overall | 2026-06-25 |
gemini-3-pro | google | Proprietary | 1,485.793839 | 1,481.938734 | 1,489.648944 | 3.8688 | 41,298 | 8 | overall | 2026-06-25 |
claude-opus-4-8-thinking | anthropic | Proprietary | 1,483.709658 | 1,477.951303 | 1,489.468013 | 8.631787 | 18,680 | 9 | overall | 2026-06-25 |
gpt-5.5-high | openai | Proprietary | 1,481.251334 | 1,476.510711 | 1,485.991957 | 5.850253 | 33,718 | 10 | overall | 2026-06-25 |
claude-opus-4-8 | anthropic | Proprietary | 1,479.055849 | 1,473.221198 | 1,484.8905 | 8.862037 | 19,038 | 11 | overall | 2026-06-25 |
gpt-5.4-high | openai | Proprietary | 1,477.752529 | 1,473.571539 | 1,481.933519 | 4.550532 | 46,702 | 12 | overall | 2026-06-25 |
gemini-3.5-flash | google | Proprietary | 1,476.433393 | 1,469.916838 | 1,482.949949 | 11.054523 | 10,159 | 13 | overall | 2026-06-25 |
gpt-5.2-chat-latest-20260210 | openai | Proprietary | 1,475.519582 | 1,471.365579 | 1,479.673584 | 4.491975 | 34,532 | 14 | overall | 2026-06-25 |
grok-4.20-beta-0309-reasoning | xai | Proprietary | 1,475.503672 | 1,471.445081 | 1,479.562262 | 4.287995 | 48,117 | 15 | overall | 2026-06-25 |
qwen3.7-max-preview | alibaba | Proprietary | 1,474.836761 | 1,464.824625 | 1,484.848898 | 26.095004 | 3,731 | 16 | overall | 2026-06-25 |
gpt-5.5 | openai | Proprietary | 1,474.823442 | 1,470.095253 | 1,479.551631 | 5.819605 | 34,794 | 17 | overall | 2026-06-25 |
grok-4.20-beta1 | xai | Proprietary | 1,474.432661 | 1,469.762263 | 1,479.103059 | 5.678212 | 26,945 | 18 | overall | 2026-06-25 |
glm-5.1 | zai | MIT | 1,473.418671 | 1,468.1774 | 1,478.659942 | 7.15117 | 19,620 | 19 | overall | 2026-06-25 |
gemini-3-flash | google | Proprietary | 1,473.246652 | 1,468.831333 | 1,477.66197 | 5.074906 | 30,704 | 20 | overall | 2026-06-25 |
claude-opus-4-5-20251101-thinking-32k | anthropic | Proprietary | 1,472.849092 | 1,468.957003 | 1,476.741182 | 3.943387 | 37,085 | 21 | overall | 2026-06-25 |
gpt-5.5-instant | openai | Proprietary | 1,472.504811 | 1,467.369872 | 1,477.63975 | 6.863954 | 26,215 | 22 | overall | 2026-06-25 |
claude-sonnet-4-6 | anthropic | Proprietary | 1,472.214993 | 1,468.152153 | 1,476.277834 | 4.296981 | 45,162 | 23 | overall | 2026-06-25 |
grok-4.20-multi-agent-beta-0309 | xai | Proprietary | 1,471.066499 | 1,466.989056 | 1,475.143942 | 4.327925 | 47,092 | 24 | overall | 2026-06-25 |
glm-5.2 (max) | zai | MIT | 1,469.869177 | 1,462.466532 | 1,477.271821 | 14.26519 | 7,552 | 25 | overall | 2026-06-25 |
claude-opus-4-5-20251101 | anthropic | Proprietary | 1,469.301235 | 1,466.083042 | 1,472.519428 | 2.696051 | 71,133 | 26 | overall | 2026-06-25 |
ernie-5.1 | baidu | Proprietary | 1,467.723269 | 1,462.826188 | 1,472.620351 | 6.242787 | 29,784 | 27 | overall | 2026-06-25 |
gpt-5.4 | openai | Proprietary | 1,467.445763 | 1,463.359715 | 1,471.53181 | 4.34621 | 49,383 | 28 | overall | 2026-06-25 |
mimo-v2.5-pro | xiaomi | MIT | 1,465.92161 | 1,461.112467 | 1,470.730753 | 6.020592 | 31,370 | 29 | overall | 2026-06-25 |
grok-4.1-thinking | xai | Proprietary | 1,465.843793 | 1,462.590786 | 1,469.0968 | 2.754697 | 65,587 | 30 | overall | 2026-06-25 |
qwen3.5-max-preview | alibaba | Proprietary | 1,465.145218 | 1,460.153571 | 1,470.136864 | 6.486217 | 21,551 | 31 | overall | 2026-06-25 |
qwen3.7-plus | alibaba | Proprietary | 1,463.827663 | 1,457.367868 | 1,470.287458 | 10.862789 | 11,899 | 32 | overall | 2026-06-25 |
kimi-k2.6 | moonshot | Modified MIT | 1,461.150439 | 1,456.35456 | 1,465.946317 | 5.987426 | 29,798 | 33 | overall | 2026-06-25 |
qwen3.6-max-preview | alibaba | Proprietary | 1,460.40797 | 1,452.0559 | 1,468.76004 | 18.159006 | 5,217 | 34 | overall | 2026-06-25 |
gemini-3-flash (thinking-minimal) | google | Proprietary | 1,459.929597 | 1,456.649692 | 1,463.209501 | 2.800439 | 72,062 | 35 | overall | 2026-06-25 |
grok-4.1 | xai | Proprietary | 1,459.383023 | 1,456.144724 | 1,462.621321 | 2.729842 | 67,730 | 36 | overall | 2026-06-25 |
glm-5 | zai | MIT | 1,457.495502 | 1,453.068553 | 1,461.922451 | 5.101675 | 26,794 | 37 | overall | 2026-06-25 |
deepseek-v4-pro-thinking | deepseek | MIT | 1,456.695826 | 1,451.926439 | 1,461.465213 | 5.921462 | 31,874 | 38 | overall | 2026-06-25 |
deepseek-v4-pro | deepseek | MIT | 1,456.584745 | 1,451.899519 | 1,461.269972 | 5.714326 | 33,792 | 39 | overall | 2026-06-25 |
claude-sonnet-4-5-20250929-thinking-32k | anthropic | Proprietary | 1,455.477746 | 1,452.657863 | 1,458.297628 | 2.069979 | 82,471 | 40 | overall | 2026-06-25 |
dola-seed-2.0-pro | bytedance | Proprietary | 1,455.095576 | 1,451.280973 | 1,458.910179 | 3.787935 | 55,915 | 41 | overall | 2026-06-25 |
claude-sonnet-4-5-20250929 | anthropic | Proprietary | 1,455.077778 | 1,452.146835 | 1,458.008721 | 2.236241 | 80,906 | 42 | overall | 2026-06-25 |
gpt-5.1-high | openai | Proprietary | 1,454.710279 | 1,450.924624 | 1,458.495933 | 3.730661 | 40,817 | 43 | overall | 2026-06-25 |
gemma-4-31b | google | Apache 2.0 | 1,451.060312 | 1,443.454618 | 1,458.666007 | 15.058494 | 5,892 | 44 | overall | 2026-06-25 |
kimi-k2.5-thinking | moonshot | Modified MIT | 1,449.789846 | 1,446.084269 | 1,453.495424 | 3.574502 | 52,551 | 45 | overall | 2026-06-25 |
claude-opus-4-1-20250805-thinking-16k | anthropic | Proprietary | 1,449.034561 | 1,445.583162 | 1,452.48596 | 3.100945 | 49,794 | 46 | overall | 2026-06-25 |
ernie-5.0-preview-1203 | baidu | Proprietary | 1,449.023179 | 1,442.484836 | 1,455.561522 | 11.128566 | 9,745 | 47 | overall | 2026-06-25 |
gpt-5.4-mini-high | openai | Proprietary | 1,449.017927 | 1,444.86805 | 1,453.167804 | 4.483057 | 45,301 | 48 | overall | 2026-06-25 |
gpt-5.3-chat-latest | openai | Proprietary | 1,448.670934 | 1,444.321466 | 1,453.020401 | 4.924657 | 33,090 | 49 | overall | 2026-06-25 |
mimo-v2-pro | xiaomi | Proprietary | 1,448.503695 | 1,443.734425 | 1,453.272964 | 5.92117 | 24,588 | 50 | overall | 2026-06-25 |
minimax-m3 | minimax | Proprietary | 1,447.337557 | 1,441.357152 | 1,453.317963 | 9.31033 | 16,510 | 51 | overall | 2026-06-25 |
ernie-5.0-0110 | baidu | Proprietary | 1,447.153579 | 1,443.151232 | 1,451.155925 | 4.169972 | 35,286 | 52 | overall | 2026-06-25 |
claude-opus-4-1-20250805 | anthropic | Proprietary | 1,447.037415 | 1,444.047927 | 1,450.026904 | 2.326471 | 77,297 | 53 | overall | 2026-06-25 |
gemini-2.5-pro | google | Proprietary | 1,445.715466 | 1,443.21202 | 1,448.218912 | 1.631475 | 124,541 | 54 | overall | 2026-06-25 |
gpt-4.5-preview-2025-02-27 | openai | Proprietary | 1,444.614835 | 1,438.919619 | 1,450.31005 | 8.443531 | 14,547 | 55 | overall | 2026-06-25 |
qwen3.6-plus | alibaba | Proprietary | 1,444.150102 | 1,439.56018 | 1,448.740024 | 5.484214 | 33,663 | 56 | overall | 2026-06-25 |
chatgpt-4o-latest-20250326 | openai | Proprietary | 1,443.199674 | 1,440.385302 | 1,446.014045 | 2.061896 | 82,431 | 57 | overall | 2026-06-25 |
qwen3.5-397b-a17b | alibaba | Apache 2.0 | 1,443.082146 | 1,439.204275 | 1,446.960018 | 3.914629 | 47,774 | 58 | overall | 2026-06-25 |
grok-4.3 | xai | Proprietary | 1,442.888363 | 1,438.222748 | 1,447.553979 | 5.666589 | 33,981 | 59 | overall | 2026-06-25 |
glm-4.7 | zai | MIT | 1,442.460738 | 1,436.361663 | 1,448.559814 | 9.683489 | 12,114 | 60 | overall | 2026-06-25 |
gpt-5.1 | openai | Proprietary | 1,438.835136 | 1,435.170233 | 1,442.500038 | 3.496462 | 43,443 | 61 | overall | 2026-06-25 |
gemma-4-26b-a4b | google | Apache 2.0 | 1,438.285365 | 1,430.645036 | 1,445.925695 | 15.195954 | 5,811 | 62 | overall | 2026-06-25 |
gpt-5.2-high | openai | Proprietary | 1,437.492823 | 1,433.844668 | 1,441.140978 | 3.464578 | 48,051 | 63 | overall | 2026-06-25 |
deepseek-v4-flash-thinking | deepseek | MIT | 1,437.296641 | 1,432.602486 | 1,441.990796 | 5.736126 | 33,381 | 64 | overall | 2026-06-25 |
longcat-flash-chat-2602-exp | meituan | Proprietary | 1,435.697929 | 1,431.050204 | 1,440.345653 | 5.623213 | 28,173 | 65 | overall | 2026-06-25 |
deepseek-v4-flash | deepseek | MIT | 1,435.140834 | 1,430.481936 | 1,439.799732 | 5.650283 | 33,456 | 66 | overall | 2026-06-25 |
gpt-5.2 | openai | Proprietary | 1,435.076451 | 1,431.640143 | 1,438.512759 | 3.073888 | 65,237 | 67 | overall | 2026-06-25 |
qwen3-max-preview | alibaba | Proprietary | 1,434.844597 | 1,430.345354 | 1,439.343839 | 5.26966 | 27,708 | 68 | overall | 2026-06-25 |
gpt-5-high | openai | Proprietary | 1,433.628171 | 1,429.117528 | 1,438.138814 | 5.296399 | 31,926 | 69 | overall | 2026-06-25 |
mimo-v2.5 | xiaomi | MIT | 1,433.535387 | 1,428.757486 | 1,438.313287 | 5.942621 | 32,112 | 70 | overall | 2026-06-25 |
kimi-k2.5-instant | moonshot | Modified MIT | 1,431.764367 | 1,425.174524 | 1,438.354209 | 11.304565 | 8,179 | 71 | overall | 2026-06-25 |
gemini-3.1-flash-lite-preview | google | Proprietary | 1,431.634948 | 1,427.788289 | 1,435.481608 | 3.851867 | 54,156 | 72 | overall | 2026-06-25 |
mimo-v2-omni | xiaomi | Proprietary | 1,431.110691 | 1,425.238166 | 1,436.983215 | 8.977461 | 17,536 | 73 | overall | 2026-06-25 |
grok-4-1-fast-reasoning | xai | Proprietary | 1,431.080711 | 1,427.81417 | 1,434.347253 | 2.777667 | 56,839 | 74 | overall | 2026-06-25 |
o3-2025-04-16 | openai | Proprietary | 1,431.021911 | 1,427.412757 | 1,434.631064 | 3.390896 | 59,742 | 75 | overall | 2026-06-25 |
kimi-k2-thinking-turbo | moonshot | Modified MIT | 1,429.910542 | 1,426.745389 | 1,433.075694 | 2.607913 | 62,067 | 76 | overall | 2026-06-25 |
amazon-nova-experimental-chat-26-02-10 | amazon | Proprietary | 1,427.002638 | 1,417.151602 | 1,436.853673 | 25.261991 | 3,419 | 77 | overall | 2026-06-25 |
mistral-medium-3.5 | mistral | Modified MIT | 1,426.982264 | 1,420.348991 | 1,433.615536 | 11.45406 | 10,777 | 78 | overall | 2026-06-25 |
gpt-5-chat | openai | Proprietary | 1,426.639182 | 1,422.35701 | 1,430.921354 | 4.773447 | 31,562 | 79 | overall | 2026-06-25 |
glm-4.6 | zai | MIT | 1,425.362364 | 1,421.441508 | 1,429.28322 | 4.001894 | 35,626 | 80 | overall | 2026-06-25 |
deepseek-v3.2 | deepseek | MIT | 1,425.094104 | 1,421.525141 | 1,428.663067 | 3.315796 | 47,278 | 81 | overall | 2026-06-25 |
deepseek-v3.2-exp-thinking | deepseek | MIT | 1,424.740857 | 1,418.187035 | 1,431.294679 | 11.18132 | 9,068 | 82 | overall | 2026-06-25 |
claude-opus-4-20250514-thinking-16k | anthropic | Proprietary | 1,424.388318 | 1,420.04384 | 1,428.732797 | 4.913366 | 36,886 | 83 | overall | 2026-06-25 |
qwen3-max-2025-09-23 | alibaba | Proprietary | 1,424.260476 | 1,417.810821 | 1,430.71013 | 10.828709 | 9,155 | 84 | overall | 2026-06-25 |
qwen3-235b-a22b-instruct-2507 | alibaba | Apache 2.0 | 1,423.198029 | 1,420.598444 | 1,425.797613 | 1.759186 | 97,193 | 85 | overall | 2026-06-25 |
deepseek-v3.2-thinking | deepseek | MIT | 1,423.083258 | 1,419.425518 | 1,426.740997 | 3.482807 | 41,077 | 86 | overall | 2026-06-25 |
deepseek-v3.2-exp | deepseek | MIT | 1,423.02209 | 1,416.618363 | 1,429.425816 | 10.675038 | 11,923 | 87 | overall | 2026-06-25 |
deepseek-r1-0528 | deepseek | MIT | 1,422.171106 | 1,416.514287 | 1,427.827924 | 8.330064 | 18,463 | 88 | overall | 2026-06-25 |
grok-4-fast-chat | xai | Proprietary | 1,420.821669 | 1,413.20105 | 1,428.442289 | 15.117652 | 6,810 | 89 | overall | 2026-06-25 |
nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | nvidia | OpenMDW-1.1 | 1,420.455577 | 1,412.99481 | 1,427.916344 | 14.490079 | 7,484 | 90 | overall | 2026-06-25 |
ernie-5.0-preview-1022 | baidu | Proprietary | 1,419.123204 | 1,410.312644 | 1,427.933764 | 20.20742 | 4,704 | 91 | overall | 2026-06-25 |
kimi-k2-0905-preview | moonshot | Modified MIT | 1,417.878178 | 1,411.385549 | 1,424.370807 | 10.973496 | 11,771 | 92 | overall | 2026-06-25 |
deepseek-v3.1-terminus-thinking | deepseek | MIT | 1,417.551815 | 1,407.577642 | 1,427.525989 | 25.897488 | 3,461 | 93 | overall | 2026-06-25 |
kimi-k2-0711-preview | moonshot | Modified MIT | 1,417.462692 | 1,412.604723 | 1,422.32066 | 6.143461 | 27,629 | 94 | overall | 2026-06-25 |
deepseek-v3.1 | deepseek | MIT | 1,417.394343 | 1,411.384932 | 1,423.403754 | 9.400861 | 14,957 | 95 | overall | 2026-06-25 |
qwen3.5-122b-a10b | alibaba | Apache 2.0 | 1,417.270353 | 1,412.868127 | 1,421.672579 | 5.044853 | 28,566 | 96 | overall | 2026-06-25 |
deepseek-v3.1-thinking | deepseek | MIT | 1,417.153522 | 1,410.544325 | 1,423.76272 | 11.371069 | 11,730 | 97 | overall | 2026-06-25 |
minimax-m2.7 | minimax | Modified MIT | 1,416.652504 | 1,412.258019 | 1,421.046989 | 5.027126 | 39,341 | 98 | overall | 2026-06-25 |
deepseek-v3.1-terminus | deepseek | MIT | 1,416.068486 | 1,406.451144 | 1,425.685829 | 24.077645 | 3,700 | 99 | overall | 2026-06-25 |
amazon-nova-experimental-chat-26-01-10 | amazon | Proprietary | 1,416.028468 | 1,406.127414 | 1,425.929521 | 25.519175 | 3,407 | 100 | overall | 2026-06-25 |
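
Adjacent ranks in the preview are often closer together than their confidence intervals are wide, so a rank difference alone may not indicate a meaningful gap. The following is a minimal sketch that checks whether two models' `[rating_lower, rating_upper]` ranges overlap. The `Interval` type and `intervals_overlap` helper are hypothetical names introduced for illustration; the numbers are copied from the preview rows above.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Confidence bounds for one model: (rating_lower, rating_upper)."""
    lower: float
    upper: float


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """Return True when the two closed intervals share at least one point."""
    return a.lower <= b.upper and b.lower <= a.upper


# Bounds copied from the preview rows (ranks 1, 2, and 100).
fable = Interval(1498.907926, 1517.417502)             # claude-fable-5
opus_46_thinking = Interval(1499.186668, 1506.985563)  # claude-opus-4-6-thinking
nova_0110 = Interval(1406.127414, 1425.929521)         # amazon-nova-experimental-chat-26-01-10

print(intervals_overlap(fable, opus_46_thinking))  # True: ranks 1 and 2 overlap
print(intervals_overlap(fable, nova_0110))         # False: ranks 1 and 100 do not
```

Overlapping intervals suggest, but do not prove, that the ordering of two models could change with more votes.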
Arena Leaderboard Dataset
Historical snapshots of the Arena leaderboard.
Usage
from datasets import load_dataset
# Load all historical text style control data
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text_style_control", split="full")
# Load the current text style control leaderboard
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text_style_control", split="latest")
# Filter to overall category
ds = load_dataset(
    "lmarena-ai/leaderboard-dataset", "text_style_control", split="latest",
    filters=[("category", "==", "overall")],
)
# Track a specific model's rating over time
ds = load_dataset(
    "lmarena-ai/leaderboard-dataset", "text_style_control", split="full",
    filters=[("category", "==", "overall"), ("model_name", "==", "gpt-4o-2024-05-13")],
    columns=["model_name", "rating", "rank", "leaderboard_publish_date"],
)
# Load raw (non-style-controlled) text ratings
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text", split="full")
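
Once rows are loaded, a model's history can be assembled by sorting on `leaderboard_publish_date`. The following is a minimal sketch, assuming each row is a plain dictionary keyed by the schema's column names (iterating over a loaded dataset yields such dictionaries). The helper `rating_history` is a hypothetical name, and the sample rows are made up for illustration; they are not real leaderboard data.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def rating_history(
    rows: Iterable[Mapping[str, Any]],
    model_name: str,
    category: str = "overall",
) -> List[Tuple[str, float]]:
    """Return (publish_date, rating) pairs for one model, oldest first."""
    history = [
        (str(row["leaderboard_publish_date"]), float(row["rating"]))
        for row in rows
        if row["model_name"] == model_name and row["category"] == category
    ]
    # YYYY-MM-DD strings sort chronologically when sorted lexicographically.
    return sorted(history)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"model_name": "model-a", "category": "overall", "rating": 1410.0,
     "leaderboard_publish_date": "2025-09-01"},
    {"model_name": "model-a", "category": "overall", "rating": 1400.0,
     "leaderboard_publish_date": "2025-08-01"},
    {"model_name": "model-a", "category": "coding", "rating": 1450.0,
     "leaderboard_publish_date": "2025-08-01"},
    {"model_name": "model-b", "category": "overall", "rating": 1380.0,
     "leaderboard_publish_date": "2025-08-01"},
]

print(rating_history(sample_rows, "model-a"))
# [('2025-08-01', 1400.0), ('2025-09-01', 1410.0)]
```

Filtering in Python like this is convenient for small slices; for the full history, pushing the filters into `load_dataset` as shown above avoids loading rows that are discarded anyway.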
Subsets
Each arena is a separate subset. Arenas with style control have an additional subset with a _style_control suffix.
- text, text_style_control
- vision, vision_style_control
- search, search_style_control
- document, document_style_control
- webdev (Code Arena)
- text_to_image
- image_edit
- text_to_video
- image_to_video
- video_edit
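
The naming rule for subsets can be sketched as a small helper. `subset_name` and the two constants below are hypothetical names; they only encode the subset list given above and are not part of the dataset or the `datasets` library.

```python
# Arenas that, per the list above, also publish a style-controlled subset.
STYLE_CONTROL_ARENAS = {"text", "vision", "search", "document"}

# Arenas listed above without a style-controlled counterpart.
OTHER_ARENAS = {
    "webdev", "text_to_image", "image_edit",
    "text_to_video", "image_to_video", "video_edit",
}


def subset_name(arena: str, style_control: bool = False) -> str:
    """Build the subset name to pass as the second argument of load_dataset."""
    if arena not in STYLE_CONTROL_ARENAS | OTHER_ARENAS:
        raise ValueError(f"unknown arena: {arena!r}")
    if not style_control:
        return arena
    if arena not in STYLE_CONTROL_ARENAS:
        raise ValueError(f"no style-control subset listed for {arena!r}")
    return f"{arena}_style_control"


print(subset_name("text", style_control=True))  # text_style_control
print(subset_name("webdev"))                    # webdev
```

Raising on an unlisted combination, rather than returning a guessed name, keeps a typo from turning into a confusing download error later.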
Splits
- full: All historically published leaderboards
- latest: Only the most recently published leaderboards
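
Conceptually, the latest split is the slice of full with the newest publish date. The sketch below illustrates that relationship under two assumptions that the card does not state: rows are dictionaries keyed by the schema's column names, and every row of a given snapshot shares one `leaderboard_publish_date`. `latest_snapshot` is a hypothetical helper and the sample rows are made up.

```python
from typing import Any, Iterable, List, Mapping


def latest_snapshot(rows: Iterable[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
    """Keep only the rows whose leaderboard_publish_date is the newest one."""
    rows = list(rows)
    if not rows:
        return []
    # YYYY-MM-DD strings compare in chronological order.
    newest = max(row["leaderboard_publish_date"] for row in rows)
    return [row for row in rows if row["leaderboard_publish_date"] == newest]


# Made-up rows spanning two publish dates.
full_rows = [
    {"model_name": "model-a", "rank": 1, "leaderboard_publish_date": "2026-06-18"},
    {"model_name": "model-a", "rank": 2, "leaderboard_publish_date": "2026-06-25"},
    {"model_name": "model-b", "rank": 1, "leaderboard_publish_date": "2026-06-25"},
]

for row in latest_snapshot(full_rows):
    print(row["model_name"], row["rank"])
# model-a 2
# model-b 1
```

In practice, loading the latest split directly is simpler; the helper is mainly useful for reconstructing an older snapshot by swapping `max` for a specific date.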
Notes
- On January 9, 2024, the rating system was updated from Elo to Bradley-Terry.
- On May 16, 2025, style control was made the default for the text and vision arenas, and the offset was adjusted to put the style-control leaderboard on the same rating scale as the non-style-control leaderboard.
- On July 23, 2025, frequency-based re-weighting was implemented.
- Search and Webdev Arena data start from their releases on arena.ai (then lmarena.ai) on August 7 and November 12, 2025, respectively.
For a complete record of all leaderboard methodology changes, see the Leaderboard Changelog.
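
Under a Bradley-Terry model reported on an Elo-like scale, a rating gap maps to an expected win rate. The sketch below assumes the conventional base-10, 400-point scale. The card does not state the scale constants, so `scale` and `base` are assumptions, and `win_probability` is a hypothetical helper rather than part of the leaderboard code. It also ignores ties and style-control adjustments.

```python
def win_probability(rating_a: float, rating_b: float,
                    scale: float = 400.0, base: float = 10.0) -> float:
    """Expected probability that model A is preferred over model B.

    Uses the logistic form shared by Elo and Bradley-Terry ratings:
    P(A beats B) = 1 / (1 + base ** ((rating_b - rating_a) / scale)).
    """
    return 1.0 / (1.0 + base ** ((rating_b - rating_a) / scale))


# Equal ratings give an even matchup.
print(win_probability(1500.0, 1500.0))  # 0.5

# Ratings of rank 1 and rank 10 in the preview table, about 27 points apart.
p = win_probability(1508.162714, 1481.251334)
print(f"{p:.2f}")  # about 0.54 under the assumed scale
```

Under these assumptions, even the gap between rank 1 and rank 10 corresponds to only a modest preference rate, which is one reason to read the ratings together with their confidence bounds.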
Schema
| Column | Type | Description |
|---|---|---|
| model_name | string | Model identifier |
| organization | string | Model creator/organization |
| license | string | Model license |
| rating | float | Arena Score |
| rating_lower | float | Lower confidence bound |
| rating_upper | float | Upper confidence bound |
| variance | float | Rating variance |
| vote_count | int | Number of battles for this model (the dataset preview reports this column as float64) |
| rank | int | Rank within this leaderboard |
| category | string | Leaderboard category (e.g., overall, coding, math) |
| leaderboard_publish_date | string | Date that this score was published (YYYY-MM-DD) |
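
The preview rows are consistent with the bounds being a 95% normal interval around the rating, that is, `rating ± z * sqrt(variance)` with z close to 1.96. This is an observation from the displayed values, not a documented guarantee, and `bounds_from_variance` is a hypothetical helper introduced for illustration.

```python
import math
from statistics import NormalDist
from typing import Tuple


def bounds_from_variance(rating: float, variance: float,
                         confidence: float = 0.95) -> Tuple[float, float]:
    """Return (lower, upper) for a two-sided normal interval around rating."""
    # Two-sided quantile, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(variance)
    return rating - half_width, rating + half_width


# First preview row (claude-fable-5): rating and variance as displayed.
lower, upper = bounds_from_variance(1508.162714, 22.296503)
print(round(lower, 2), round(upper, 2))
# Displayed bounds for comparison: 1,498.907926 and 1,517.417502
```

If the relationship holds for a given snapshot, `variance` alone is enough to derive intervals at other confidence levels by changing the `confidence` argument.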
