URL: https://huggingface.co/datasets/lmarena-ai/leaderboard-dataset

lmarena-ai/leaderboard-dataset · Datasets at Hugging Face


| model_name (string) | organization (string) | license (string) | rating (float64) | rating_lower (float64) | rating_upper (float64) | variance (float64) | vote_count (float64) | rank (int64) | category (string) | leaderboard_publish_date (string) |
|---|---|---|---|---|---|---|---|---|---|---|
| claude-fable-5 | anthropic | Proprietary | 1,508.162714 | 1,498.907926 | 1,517.417502 | 22.296503 | 4,366 | 1 | overall | 2026-06-25 |
| claude-opus-4-6-thinking | anthropic | Proprietary | 1,503.086115 | 1,499.186668 | 1,506.985563 | 3.958312 | 51,769 | 2 | overall | 2026-06-25 |
| claude-opus-4-7-thinking | anthropic | Proprietary | 1,502.307687 | 1,497.815951 | 1,506.799424 | 5.252092 | 38,326 | 3 | overall | 2026-06-25 |
| claude-opus-4-6 | anthropic | Proprietary | 1,499.068978 | 1,495.266239 | 1,502.871716 | 3.764408 | 55,027 | 4 | overall | 2026-06-25 |
| claude-opus-4-7 | anthropic | Proprietary | 1,493.802678 | 1,489.315395 | 1,498.289961 | 5.241683 | 39,550 | 5 | overall | 2026-06-25 |
| muse-spark | meta | Proprietary | 1,487.27495 | 1,481.41017 | 1,493.13973 | 8.953798 | 13,598 | 6 | overall | 2026-06-25 |
| gemini-3.1-pro-preview | google | Proprietary | 1,486.086602 | 1,482.37916 | 1,489.794045 | 3.578102 | 68,291 | 7 | overall | 2026-06-25 |
| gemini-3-pro | google | Proprietary | 1,485.793839 | 1,481.938734 | 1,489.648944 | 3.8688 | 41,298 | 8 | overall | 2026-06-25 |
| claude-opus-4-8-thinking | anthropic | Proprietary | 1,483.709658 | 1,477.951303 | 1,489.468013 | 8.631787 | 18,680 | 9 | overall | 2026-06-25 |
| gpt-5.5-high | openai | Proprietary | 1,481.251334 | 1,476.510711 | 1,485.991957 | 5.850253 | 33,718 | 10 | overall | 2026-06-25 |
| claude-opus-4-8 | anthropic | Proprietary | 1,479.055849 | 1,473.221198 | 1,484.8905 | 8.862037 | 19,038 | 11 | overall | 2026-06-25 |
| gpt-5.4-high | openai | Proprietary | 1,477.752529 | 1,473.571539 | 1,481.933519 | 4.550532 | 46,702 | 12 | overall | 2026-06-25 |
| gemini-3.5-flash | google | Proprietary | 1,476.433393 | 1,469.916838 | 1,482.949949 | 11.054523 | 10,159 | 13 | overall | 2026-06-25 |
| gpt-5.2-chat-latest-20260210 | openai | Proprietary | 1,475.519582 | 1,471.365579 | 1,479.673584 | 4.491975 | 34,532 | 14 | overall | 2026-06-25 |
| grok-4.20-beta-0309-reasoning | xai | Proprietary | 1,475.503672 | 1,471.445081 | 1,479.562262 | 4.287995 | 48,117 | 15 | overall | 2026-06-25 |
| qwen3.7-max-preview | alibaba | Proprietary | 1,474.836761 | 1,464.824625 | 1,484.848898 | 26.095004 | 3,731 | 16 | overall | 2026-06-25 |
| gpt-5.5 | openai | Proprietary | 1,474.823442 | 1,470.095253 | 1,479.551631 | 5.819605 | 34,794 | 17 | overall | 2026-06-25 |
| grok-4.20-beta1 | xai | Proprietary | 1,474.432661 | 1,469.762263 | 1,479.103059 | 5.678212 | 26,945 | 18 | overall | 2026-06-25 |
| glm-5.1 | zai | MIT | 1,473.418671 | 1,468.1774 | 1,478.659942 | 7.15117 | 19,620 | 19 | overall | 2026-06-25 |
| gemini-3-flash | google | Proprietary | 1,473.246652 | 1,468.831333 | 1,477.66197 | 5.074906 | 30,704 | 20 | overall | 2026-06-25 |
| claude-opus-4-5-20251101-thinking-32k | anthropic | Proprietary | 1,472.849092 | 1,468.957003 | 1,476.741182 | 3.943387 | 37,085 | 21 | overall | 2026-06-25 |
| gpt-5.5-instant | openai | Proprietary | 1,472.504811 | 1,467.369872 | 1,477.63975 | 6.863954 | 26,215 | 22 | overall | 2026-06-25 |
| claude-sonnet-4-6 | anthropic | Proprietary | 1,472.214993 | 1,468.152153 | 1,476.277834 | 4.296981 | 45,162 | 23 | overall | 2026-06-25 |
| grok-4.20-multi-agent-beta-0309 | xai | Proprietary | 1,471.066499 | 1,466.989056 | 1,475.143942 | 4.327925 | 47,092 | 24 | overall | 2026-06-25 |
| glm-5.2 (max) | zai | MIT | 1,469.869177 | 1,462.466532 | 1,477.271821 | 14.26519 | 7,552 | 25 | overall | 2026-06-25 |
| claude-opus-4-5-20251101 | anthropic | Proprietary | 1,469.301235 | 1,466.083042 | 1,472.519428 | 2.696051 | 71,133 | 26 | overall | 2026-06-25 |
| ernie-5.1 | baidu | Proprietary | 1,467.723269 | 1,462.826188 | 1,472.620351 | 6.242787 | 29,784 | 27 | overall | 2026-06-25 |
| gpt-5.4 | openai | Proprietary | 1,467.445763 | 1,463.359715 | 1,471.53181 | 4.34621 | 49,383 | 28 | overall | 2026-06-25 |
| mimo-v2.5-pro | xiaomi | MIT | 1,465.92161 | 1,461.112467 | 1,470.730753 | 6.020592 | 31,370 | 29 | overall | 2026-06-25 |
| grok-4.1-thinking | xai | Proprietary | 1,465.843793 | 1,462.590786 | 1,469.0968 | 2.754697 | 65,587 | 30 | overall | 2026-06-25 |
| qwen3.5-max-preview | alibaba | Proprietary | 1,465.145218 | 1,460.153571 | 1,470.136864 | 6.486217 | 21,551 | 31 | overall | 2026-06-25 |
| qwen3.7-plus | alibaba | Proprietary | 1,463.827663 | 1,457.367868 | 1,470.287458 | 10.862789 | 11,899 | 32 | overall | 2026-06-25 |
| kimi-k2.6 | moonshot | Modified MIT | 1,461.150439 | 1,456.35456 | 1,465.946317 | 5.987426 | 29,798 | 33 | overall | 2026-06-25 |
| qwen3.6-max-preview | alibaba | Proprietary | 1,460.40797 | 1,452.0559 | 1,468.76004 | 18.159006 | 5,217 | 34 | overall | 2026-06-25 |
| gemini-3-flash (thinking-minimal) | google | Proprietary | 1,459.929597 | 1,456.649692 | 1,463.209501 | 2.800439 | 72,062 | 35 | overall | 2026-06-25 |
| grok-4.1 | xai | Proprietary | 1,459.383023 | 1,456.144724 | 1,462.621321 | 2.729842 | 67,730 | 36 | overall | 2026-06-25 |
| glm-5 | zai | MIT | 1,457.495502 | 1,453.068553 | 1,461.922451 | 5.101675 | 26,794 | 37 | overall | 2026-06-25 |
| deepseek-v4-pro-thinking | deepseek | MIT | 1,456.695826 | 1,451.926439 | 1,461.465213 | 5.921462 | 31,874 | 38 | overall | 2026-06-25 |
| deepseek-v4-pro | deepseek | MIT | 1,456.584745 | 1,451.899519 | 1,461.269972 | 5.714326 | 33,792 | 39 | overall | 2026-06-25 |
| claude-sonnet-4-5-20250929-thinking-32k | anthropic | Proprietary | 1,455.477746 | 1,452.657863 | 1,458.297628 | 2.069979 | 82,471 | 40 | overall | 2026-06-25 |
| dola-seed-2.0-pro | bytedance | Proprietary | 1,455.095576 | 1,451.280973 | 1,458.910179 | 3.787935 | 55,915 | 41 | overall | 2026-06-25 |
| claude-sonnet-4-5-20250929 | anthropic | Proprietary | 1,455.077778 | 1,452.146835 | 1,458.008721 | 2.236241 | 80,906 | 42 | overall | 2026-06-25 |
| gpt-5.1-high | openai | Proprietary | 1,454.710279 | 1,450.924624 | 1,458.495933 | 3.730661 | 40,817 | 43 | overall | 2026-06-25 |
| gemma-4-31b | google | Apache 2.0 | 1,451.060312 | 1,443.454618 | 1,458.666007 | 15.058494 | 5,892 | 44 | overall | 2026-06-25 |
| kimi-k2.5-thinking | moonshot | Modified MIT | 1,449.789846 | 1,446.084269 | 1,453.495424 | 3.574502 | 52,551 | 45 | overall | 2026-06-25 |
| claude-opus-4-1-20250805-thinking-16k | anthropic | Proprietary | 1,449.034561 | 1,445.583162 | 1,452.48596 | 3.100945 | 49,794 | 46 | overall | 2026-06-25 |
| ernie-5.0-preview-1203 | baidu | Proprietary | 1,449.023179 | 1,442.484836 | 1,455.561522 | 11.128566 | 9,745 | 47 | overall | 2026-06-25 |
| gpt-5.4-mini-high | openai | Proprietary | 1,449.017927 | 1,444.86805 | 1,453.167804 | 4.483057 | 45,301 | 48 | overall | 2026-06-25 |
| gpt-5.3-chat-latest | openai | Proprietary | 1,448.670934 | 1,444.321466 | 1,453.020401 | 4.924657 | 33,090 | 49 | overall | 2026-06-25 |
| mimo-v2-pro | xiaomi | Proprietary | 1,448.503695 | 1,443.734425 | 1,453.272964 | 5.92117 | 24,588 | 50 | overall | 2026-06-25 |
| minimax-m3 | minimax | Proprietary | 1,447.337557 | 1,441.357152 | 1,453.317963 | 9.31033 | 16,510 | 51 | overall | 2026-06-25 |
| ernie-5.0-0110 | baidu | Proprietary | 1,447.153579 | 1,443.151232 | 1,451.155925 | 4.169972 | 35,286 | 52 | overall | 2026-06-25 |
| claude-opus-4-1-20250805 | anthropic | Proprietary | 1,447.037415 | 1,444.047927 | 1,450.026904 | 2.326471 | 77,297 | 53 | overall | 2026-06-25 |
| gemini-2.5-pro | google | Proprietary | 1,445.715466 | 1,443.21202 | 1,448.218912 | 1.631475 | 124,541 | 54 | overall | 2026-06-25 |
| gpt-4.5-preview-2025-02-27 | openai | Proprietary | 1,444.614835 | 1,438.919619 | 1,450.31005 | 8.443531 | 14,547 | 55 | overall | 2026-06-25 |
| qwen3.6-plus | alibaba | Proprietary | 1,444.150102 | 1,439.56018 | 1,448.740024 | 5.484214 | 33,663 | 56 | overall | 2026-06-25 |
| chatgpt-4o-latest-20250326 | openai | Proprietary | 1,443.199674 | 1,440.385302 | 1,446.014045 | 2.061896 | 82,431 | 57 | overall | 2026-06-25 |
| qwen3.5-397b-a17b | alibaba | Apache 2.0 | 1,443.082146 | 1,439.204275 | 1,446.960018 | 3.914629 | 47,774 | 58 | overall | 2026-06-25 |
| grok-4.3 | xai | Proprietary | 1,442.888363 | 1,438.222748 | 1,447.553979 | 5.666589 | 33,981 | 59 | overall | 2026-06-25 |
| glm-4.7 | zai | MIT | 1,442.460738 | 1,436.361663 | 1,448.559814 | 9.683489 | 12,114 | 60 | overall | 2026-06-25 |
| gpt-5.1 | openai | Proprietary | 1,438.835136 | 1,435.170233 | 1,442.500038 | 3.496462 | 43,443 | 61 | overall | 2026-06-25 |
| gemma-4-26b-a4b | google | Apache 2.0 | 1,438.285365 | 1,430.645036 | 1,445.925695 | 15.195954 | 5,811 | 62 | overall | 2026-06-25 |
| gpt-5.2-high | openai | Proprietary | 1,437.492823 | 1,433.844668 | 1,441.140978 | 3.464578 | 48,051 | 63 | overall | 2026-06-25 |
| deepseek-v4-flash-thinking | deepseek | MIT | 1,437.296641 | 1,432.602486 | 1,441.990796 | 5.736126 | 33,381 | 64 | overall | 2026-06-25 |
| longcat-flash-chat-2602-exp | meituan | Proprietary | 1,435.697929 | 1,431.050204 | 1,440.345653 | 5.623213 | 28,173 | 65 | overall | 2026-06-25 |
| deepseek-v4-flash | deepseek | MIT | 1,435.140834 | 1,430.481936 | 1,439.799732 | 5.650283 | 33,456 | 66 | overall | 2026-06-25 |
| gpt-5.2 | openai | Proprietary | 1,435.076451 | 1,431.640143 | 1,438.512759 | 3.073888 | 65,237 | 67 | overall | 2026-06-25 |
| qwen3-max-preview | alibaba | Proprietary | 1,434.844597 | 1,430.345354 | 1,439.343839 | 5.26966 | 27,708 | 68 | overall | 2026-06-25 |
| gpt-5-high | openai | Proprietary | 1,433.628171 | 1,429.117528 | 1,438.138814 | 5.296399 | 31,926 | 69 | overall | 2026-06-25 |
| mimo-v2.5 | xiaomi | MIT | 1,433.535387 | 1,428.757486 | 1,438.313287 | 5.942621 | 32,112 | 70 | overall | 2026-06-25 |
| kimi-k2.5-instant | moonshot | Modified MIT | 1,431.764367 | 1,425.174524 | 1,438.354209 | 11.304565 | 8,179 | 71 | overall | 2026-06-25 |
| gemini-3.1-flash-lite-preview | google | Proprietary | 1,431.634948 | 1,427.788289 | 1,435.481608 | 3.851867 | 54,156 | 72 | overall | 2026-06-25 |
| mimo-v2-omni | xiaomi | Proprietary | 1,431.110691 | 1,425.238166 | 1,436.983215 | 8.977461 | 17,536 | 73 | overall | 2026-06-25 |
| grok-4-1-fast-reasoning | xai | Proprietary | 1,431.080711 | 1,427.81417 | 1,434.347253 | 2.777667 | 56,839 | 74 | overall | 2026-06-25 |
| o3-2025-04-16 | openai | Proprietary | 1,431.021911 | 1,427.412757 | 1,434.631064 | 3.390896 | 59,742 | 75 | overall | 2026-06-25 |
| kimi-k2-thinking-turbo | moonshot | Modified MIT | 1,429.910542 | 1,426.745389 | 1,433.075694 | 2.607913 | 62,067 | 76 | overall | 2026-06-25 |
| amazon-nova-experimental-chat-26-02-10 | amazon | Proprietary | 1,427.002638 | 1,417.151602 | 1,436.853673 | 25.261991 | 3,419 | 77 | overall | 2026-06-25 |
| mistral-medium-3.5 | mistral | Modified MIT | 1,426.982264 | 1,420.348991 | 1,433.615536 | 11.45406 | 10,777 | 78 | overall | 2026-06-25 |
| gpt-5-chat | openai | Proprietary | 1,426.639182 | 1,422.35701 | 1,430.921354 | 4.773447 | 31,562 | 79 | overall | 2026-06-25 |
| glm-4.6 | zai | MIT | 1,425.362364 | 1,421.441508 | 1,429.28322 | 4.001894 | 35,626 | 80 | overall | 2026-06-25 |
| deepseek-v3.2 | deepseek | MIT | 1,425.094104 | 1,421.525141 | 1,428.663067 | 3.315796 | 47,278 | 81 | overall | 2026-06-25 |
| deepseek-v3.2-exp-thinking | deepseek | MIT | 1,424.740857 | 1,418.187035 | 1,431.294679 | 11.18132 | 9,068 | 82 | overall | 2026-06-25 |
| claude-opus-4-20250514-thinking-16k | anthropic | Proprietary | 1,424.388318 | 1,420.04384 | 1,428.732797 | 4.913366 | 36,886 | 83 | overall | 2026-06-25 |
| qwen3-max-2025-09-23 | alibaba | Proprietary | 1,424.260476 | 1,417.810821 | 1,430.71013 | 10.828709 | 9,155 | 84 | overall | 2026-06-25 |
| qwen3-235b-a22b-instruct-2507 | alibaba | Apache 2.0 | 1,423.198029 | 1,420.598444 | 1,425.797613 | 1.759186 | 97,193 | 85 | overall | 2026-06-25 |
| deepseek-v3.2-thinking | deepseek | MIT | 1,423.083258 | 1,419.425518 | 1,426.740997 | 3.482807 | 41,077 | 86 | overall | 2026-06-25 |
| deepseek-v3.2-exp | deepseek | MIT | 1,423.02209 | 1,416.618363 | 1,429.425816 | 10.675038 | 11,923 | 87 | overall | 2026-06-25 |
| deepseek-r1-0528 | deepseek | MIT | 1,422.171106 | 1,416.514287 | 1,427.827924 | 8.330064 | 18,463 | 88 | overall | 2026-06-25 |
| grok-4-fast-chat | xai | Proprietary | 1,420.821669 | 1,413.20105 | 1,428.442289 | 15.117652 | 6,810 | 89 | overall | 2026-06-25 |
| nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | nvidia | OpenMDW-1.1 | 1,420.455577 | 1,412.99481 | 1,427.916344 | 14.490079 | 7,484 | 90 | overall | 2026-06-25 |
| ernie-5.0-preview-1022 | baidu | Proprietary | 1,419.123204 | 1,410.312644 | 1,427.933764 | 20.20742 | 4,704 | 91 | overall | 2026-06-25 |
| kimi-k2-0905-preview | moonshot | Modified MIT | 1,417.878178 | 1,411.385549 | 1,424.370807 | 10.973496 | 11,771 | 92 | overall | 2026-06-25 |
| deepseek-v3.1-terminus-thinking | deepseek | MIT | 1,417.551815 | 1,407.577642 | 1,427.525989 | 25.897488 | 3,461 | 93 | overall | 2026-06-25 |
| kimi-k2-0711-preview | moonshot | Modified MIT | 1,417.462692 | 1,412.604723 | 1,422.32066 | 6.143461 | 27,629 | 94 | overall | 2026-06-25 |
| deepseek-v3.1 | deepseek | MIT | 1,417.394343 | 1,411.384932 | 1,423.403754 | 9.400861 | 14,957 | 95 | overall | 2026-06-25 |
| qwen3.5-122b-a10b | alibaba | Apache 2.0 | 1,417.270353 | 1,412.868127 | 1,421.672579 | 5.044853 | 28,566 | 96 | overall | 2026-06-25 |
| deepseek-v3.1-thinking | deepseek | MIT | 1,417.153522 | 1,410.544325 | 1,423.76272 | 11.371069 | 11,730 | 97 | overall | 2026-06-25 |
| minimax-m2.7 | minimax | Modified MIT | 1,416.652504 | 1,412.258019 | 1,421.046989 | 5.027126 | 39,341 | 98 | overall | 2026-06-25 |
| deepseek-v3.1-terminus | deepseek | MIT | 1,416.068486 | 1,406.451144 | 1,425.685829 | 24.077645 | 3,700 | 99 | overall | 2026-06-25 |
| amazon-nova-experimental-chat-26-01-10 | amazon | Proprietary | 1,416.028468 | 1,406.127414 | 1,425.929521 | 25.519175 | 3,407 | 100 | overall | 2026-06-25 |
End of preview.

Arena Leaderboard Dataset

Historical snapshots of the Arena leaderboard.

Usage

from datasets import load_dataset

# Load all historical text style control data
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text_style_control", split="full")

# Load the current text style control leaderboard
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text_style_control", split="latest")

# Filter to overall category
ds = load_dataset(
    "lmarena-ai/leaderboard-dataset",
    "text_style_control",
    split="latest",
    filters=[("category", "==", "overall")],
)

# Track a specific model's rating over time
ds = load_dataset(
    "lmarena-ai/leaderboard-dataset",
    "text_style_control",
    split="full",
    filters=[("category", "==", "overall"), ("model_name", "==", "gpt-4o-2024-05-13")],
    columns=["model_name", "rating", "rank", "leaderboard_publish_date"],
)

# Load raw (non-style-controlled) text ratings
ds = load_dataset("lmarena-ai/leaderboard-dataset", "text", split="full")
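
Once a split is loaded, each row behaves like a plain mapping keyed by the schema columns, so a model's history can be assembled with ordinary Python. The sketch below is illustrative only: `rating_history` is a hypothetical helper (not part of the dataset or the `datasets` library), and the sample rows use made-up model names, ratings, and dates purely to show the shape of the data.

```python
from typing import Iterable, List, Mapping, Tuple


def rating_history(
    rows: Iterable[Mapping], model_name: str, category: str = "overall"
) -> List[Tuple[str, float]]:
    """Return (publish_date, rating) pairs for one model, oldest first.

    Accepts any iterable of dict-like rows, such as a loaded split.
    """
    points = [
        (row["leaderboard_publish_date"], row["rating"])
        for row in rows
        if row["model_name"] == model_name and row["category"] == category
    ]
    # YYYY-MM-DD strings sort chronologically when compared as text.
    return sorted(points)


# Made-up rows for illustration; real rows would come from load_dataset(...).
sample_rows = [
    {"model_name": "model-a", "category": "overall", "rating": 1310.0,
     "leaderboard_publish_date": "2025-02-01"},
    {"model_name": "model-a", "category": "coding", "rating": 1325.0,
     "leaderboard_publish_date": "2025-02-01"},
    {"model_name": "model-a", "category": "overall", "rating": 1300.0,
     "leaderboard_publish_date": "2025-01-01"},
    {"model_name": "model-b", "category": "overall", "rating": 1290.0,
     "leaderboard_publish_date": "2025-01-01"},
]

history = rating_history(sample_rows, "model-a")
print(history)
```

Filtering in Python like this is mainly useful after a broad load; when only one model is needed, the `filters` and `columns` arguments shown above avoid pulling the rest of the data.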

Subsets

Each arena is a separate subset. Arenas with style control have an additional subset with a _style_control suffix.

  • text, text_style_control
  • vision, vision_style_control
  • search, search_style_control
  • document, document_style_control
  • webdev (Code Arena)
  • text_to_image
  • image_edit
  • text_to_video
  • image_to_video
  • video_edit
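
The naming rule for subsets can be captured in a small helper. This is a minimal sketch: `subset_name` and the two constant sets are hypothetical names introduced here, and the sets simply mirror the list above, so they would need updating if arenas are added.

```python
# Arenas listed above with a *_style_control companion subset.
STYLE_CONTROL_ARENAS = {"text", "vision", "search", "document"}
# Arenas listed above without a style-control companion.
OTHER_ARENAS = {
    "webdev", "text_to_image", "image_edit",
    "text_to_video", "image_to_video", "video_edit",
}


def subset_name(arena: str, style_control: bool = False) -> str:
    """Build the subset (config) name for an arena, validating against the list above."""
    if arena not in STYLE_CONTROL_ARENAS | OTHER_ARENAS:
        raise ValueError(f"unknown arena: {arena!r}")
    if not style_control:
        return arena
    if arena not in STYLE_CONTROL_ARENAS:
        raise ValueError(f"arena {arena!r} has no style-control subset")
    return f"{arena}_style_control"


print(subset_name("text", style_control=True))
print(subset_name("webdev"))
```

The returned string is what would be passed as the second positional argument to `load_dataset` in the Usage examples.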

Splits

  • full: All historically published leaderboards
  • latest: Only the most recently published leaderboards
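
If only the full split is at hand, a snapshot resembling latest can be approximated by keeping the rows with the newest publish date. This sketch assumes that latest corresponds to the single most recent `leaderboard_publish_date` in a subset, which the card does not state explicitly; `latest_snapshot` is a hypothetical helper and the sample rows are made up.

```python
from typing import Iterable, List, Mapping


def latest_snapshot(rows: Iterable[Mapping]) -> List[Mapping]:
    """Keep only rows from the most recent leaderboard_publish_date."""
    rows = list(rows)
    if not rows:
        return []
    # ISO-style YYYY-MM-DD strings compare correctly as plain text.
    newest = max(row["leaderboard_publish_date"] for row in rows)
    return [row for row in rows if row["leaderboard_publish_date"] == newest]


# Made-up rows for illustration only.
full_rows = [
    {"model_name": "model-a", "rank": 2, "leaderboard_publish_date": "2025-01-01"},
    {"model_name": "model-b", "rank": 1, "leaderboard_publish_date": "2025-01-01"},
    {"model_name": "model-a", "rank": 1, "leaderboard_publish_date": "2025-02-01"},
    {"model_name": "model-b", "rank": 2, "leaderboard_publish_date": "2025-02-01"},
]

snapshot = latest_snapshot(full_rows)
print([(row["model_name"], row["rank"]) for row in snapshot])
```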

Notes

  • On January 9, 2024, the rating system was updated from Elo to Bradley-Terry.
  • On May 16, 2025, style control became the default for the text and vision arenas, and the offset was adjusted to put the style-control leaderboard on the same rating scale as the non-style-control leaderboard.
  • On July 23, 2025, frequency-based re-weighting was implemented.
  • Search and Webdev Arena data start from their releases on arena.ai (then lmarena.ai), on August 7 and November 12, 2025, respectively.

For a complete record of all leaderboard methodology changes, see the Leaderboard Changelog.
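
Under a Bradley-Terry model, a rating gap translates into an expected win probability. The sketch below assumes ratings sit on the conventional Elo-style scale (base 10, scale factor 400); the card does not state the scale, so treat both defaults as assumptions. `expected_win_probability` is a hypothetical helper, and the two ratings are copied from the first two rows of the preview.

```python
def expected_win_probability(
    rating_a: float, rating_b: float, scale: float = 400.0, base: float = 10.0
) -> float:
    """Bradley-Terry probability that A is preferred over B.

    The scale and base defaults are assumed, not taken from the dataset card.
    """
    return 1.0 / (1.0 + base ** ((rating_b - rating_a) / scale))


# Ratings of the rank-1 and rank-2 rows in the preview (overall, 2026-06-25).
p_first = expected_win_probability(1508.162714, 1503.086115)
p_second = expected_win_probability(1503.086115, 1508.162714)
print(f"rank 1 over rank 2: {p_first:.4f}")
print(f"rank 2 over rank 1: {p_second:.4f}")
```

With a gap of only about five points, the two probabilities stay close to one half, which is one reason to read neighbouring ranks alongside their confidence bounds.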

Schema

| Column | Type | Description |
|---|---|---|
| model_name | string | Model identifier |
| organization | string | Model creator/organization |
| license | string | Model license |
| rating | float | Arena Score |
| rating_lower | float | Lower confidence bound |
| rating_upper | float | Upper confidence bound |
| variance | float | Rating variance |
| vote_count | int | Number of battles for this model |
| rank | int | Rank within this leaderboard |
| category | string | Leaderboard category (e.g., overall, coding, math) |
| leaderboard_publish_date | string | Date that this score was published (YYYY-MM-DD) |
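
In the preview rows, the bounds are numerically consistent with a 95% normal interval, rating ± z·sqrt(variance) with z ≈ 1.96. That is an observation about those rows rather than a documented guarantee, so the sketch below should be read as a sanity check under that assumption. `normal_interval` and `intervals_overlap` are hypothetical helpers; the input values are copied from the first two preview rows.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def normal_interval(rating: float, variance: float, level: float = 0.95) -> Tuple[float, float]:
    """Two-sided normal interval around a rating, given its variance."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * sqrt(variance)
    return rating - half_width, rating + half_width


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# rating and variance from the rank-1 and rank-2 preview rows.
first = normal_interval(1508.162714, 22.296503)
second = normal_interval(1503.086115, 3.958312)

print(first)   # compare with rating_lower / rating_upper of the rank-1 row
print(second)  # compare with rating_lower / rating_upper of the rank-2 row
print(intervals_overlap(first, second))
```

Overlapping intervals suggest that the ordering of two neighbouring models is not strongly separated by the available votes.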