![]() |
VOOZH | about |
Live performance rankings for AI shopping agents. Compare checkout rates, token efficiency, and speed across real UCP-enabled stores. See detailed model profiles or run your own benchmark.
| # | Model | Provider | Search % | Cart % | Checkout % | Avg Tokens | Avg Duration |
|---|---|---|---|---|---|---|---|
| 1 | Claude Sonnet 4.6 | 👁 Anthropic Anthropic | 83% | 60% | 46.3% | 73,956 | 36.4s |
| 2 | Gemini 3.5 Flash | 👁 Google | 79% | 58.5% | 43.1% | 64,474 | 20.2s |
| 3 | Llama 3.3 70BRetired | 👁 Meta Meta | 82.1% | 52.7% | 39.3% | 58,192 | 40.4s |
| 4 | Gemini 2.5 Flash | 👁 Google | 76.6% | 43.9% | 33.2% | 33,433 | 13.6s |
| 5 | DeepSeek V3.2 | 👁 DeepSeek DeepSeek | 75.3% | 38.2% | 32.6% | 46,986 | 49.8s |
| 6 | Gemini 3.1 Pro | 👁 Google | 76.7% | 44% | 28.7% | 50,838 | 44.9s |
| 7 | Gemini 2.5 Pro | 👁 Google | 77.9% | 45.7% | 27.9% | 41,367 | 40.1s |
| 8 | Grok 4.3 | 👁 xAI xAI | 70.1% | 33.3% | 27.6% | 28,300 | 53.6s |
| 9 | Claude Opus 4.8 | 👁 Anthropic Anthropic | 59.9% | 34.6% | 26.8% | 51,574 | 31.1s |
| 10 | GPT-4o | 👁 OpenAI OpenAI | 77.3% | 34.8% | 22% | 40,608 | 19.9s |
| 11 | GPT-5.5 | 👁 OpenAI OpenAI | 65.7% | 25.2% | 16.8% | 64,469 | 42.7s |
| 12 | o4-mini (Reasoning) | 👁 OpenAI OpenAI | 73.2% | 28.6% | 16.1% | 57,157 | 38.6s |
| 13 | Grok 3 Mini (Reasoning)Retired | 👁 xAI xAI | 48.9% | 13.3% | 11.1% | 35,632 | 40.6s |
| 14 | DeepSeek R1 (Reasoning) | 👁 DeepSeek DeepSeek | 61.8% | 20.6% | 8.8% | 22,880 | 55.0s |
| 15 | DeepSeek V4 Pro | 👁 DeepSeek DeepSeek | 85.7% | 38.1% | 4.8% | 132,037 | 78.0s |
| 16 | Llama 4 Maverick | 👁 Meta Meta | 25% | 0% | 0% | 61,746 | 12.1s |
| 17 | QwQ 32B (Reasoning)Retired | Alibaba | 28.6% | 7.1% | 0% | 14,586 | 36.7s |