URL: https://apify.com/jungle_synthesizer/artificialanalysis-ai-model-benchmark-scraper

Artificial Analysis AI Model Benchmark Scraper

Pricing

Pay per event

Scrapes LLM benchmark scores, pricing, and performance data from Artificial Analysis, the leading independent evaluator of AI models.

Rating: 0.0 (0)

Developer: BowTiedRaccoon (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 1
  • Last modified: 19 days ago

What this actor does

Extracts structured data for ~370 AI language models from Artificial Analysis, including:

  • Benchmark scores: Quality index, MMLU-Pro, GPQA Diamond, HumanEval, LiveCodeBench, MATH-500, MMMU-Pro, and more
  • Pricing: Input, output, and blended cost per million tokens
  • Performance: Median throughput (tokens/sec) and time-to-first-token latency
  • Provider info: All hosting providers, cheapest provider by blended price
  • Model metadata: Creator/lab, release date, parameter count, context window, license, open-weight status

All data is extracted in a single request to the /models page, which serves the full model dataset inline as a React Server Component payload. No per-model crawling needed.
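
As a rough illustration of the single-request idea, the sketch below pulls inline payload chunks out of an HTML document. It assumes the page embeds its React Server Component data in Next.js-style `self.__next_f.push([1, "..."])` script calls. That chunk format, the regex, and the name `extract_rsc_payload` are assumptions for illustration, not details confirmed by the actor's source. The sample HTML is synthetic, and a real payload would need further parsing after the chunks are joined.

```python
import json
import re

# Assumed chunk format: self.__next_f.push([1,"<escaped string literal>"])
_CHUNK_RE = re.compile(r'self\.__next_f\.push\(\[1,\s*("(?:[^"\\]|\\.)*")\]\)')


def extract_rsc_payload(html: str) -> str:
    """Concatenate all inline payload chunks found in the HTML, in order."""
    chunks = []
    for match in _CHUNK_RE.finditer(html):
        # Each chunk is a quoted string literal; json.loads unescapes it.
        chunks.append(json.loads(match.group(1)))
    return "".join(chunks)


# Synthetic page with one JSON object split across two chunks.
sample_html = (
    '<script>self.__next_f.push([1,"{\\"model_slug\\":"])</script>'
    '<script>self.__next_f.push([1,"\\"claude-4-opus\\"}"])</script>'
)
payload = extract_rsc_payload(sample_html)
print(json.loads(payload)["model_slug"])
```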

Use cases

  • Model selection: Compare cost-vs-quality trade-offs across providers
  • Price monitoring: Track pricing changes across OpenAI, Anthropic, Google, Meta, and 40+ hosting providers
  • Research and benchmarking: Import baseline scores into your own evaluation pipeline
  • Cost optimization: Find the cheapest or fastest provider for a given quality target
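
The cost-optimization use case can be sketched as a small filter over the actor's output records. The field names `aa_quality_index` and `price_blended_usd_per_million` come from the output format documented on this page; the helper name `cheapest_at_quality` and the sample records are hypothetical and do not reflect real benchmark data.

```python
from typing import Optional


def cheapest_at_quality(records: list[dict], min_quality: float) -> Optional[dict]:
    """Return the lowest-priced record whose quality index meets the target."""
    candidates = [
        r for r in records
        if r.get("aa_quality_index") is not None
        and r.get("price_blended_usd_per_million") is not None
        and r["aa_quality_index"] >= min_quality
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["price_blended_usd_per_million"])


# Made-up records for illustration only; not real benchmark data.
records = [
    {"model_slug": "model-a", "aa_quality_index": 60.0, "price_blended_usd_per_million": 30.0},
    {"model_slug": "model-b", "aa_quality_index": 52.0, "price_blended_usd_per_million": 4.0},
    {"model_slug": "model-c", "aa_quality_index": 41.0, "price_blended_usd_per_million": 0.5},
]
best = cheapest_at_quality(records, min_quality=50)
print(best["model_slug"] if best else "no match")
```

Records with a null quality index or price are skipped rather than treated as zero, so missing data cannot win the comparison.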

Input

| Field    | Type    | Required | Default | Description |
|----------|---------|----------|---------|-------------|
| maxItems | integer | Yes      | 10      | Maximum number of model records to return. Set it to a large number (e.g., 500) to retrieve all models. |
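
A minimal sketch of running the actor from Python with the `apify-client` package might look like the following. The actor ID is taken from the store URL of this page and is assumed to be the correct identifier. The `APIFY_TOKEN` environment variable name, the helper names, and the positive-integer check are also assumptions rather than documented behavior.

```python
import os

# Assumed actor ID, derived from the store URL of this page.
ACTOR_ID = "jungle_synthesizer/artificialanalysis-ai-model-benchmark-scraper"


def build_run_input(max_items: int = 10) -> dict:
    """Build the actor input; maxItems is the only documented field."""
    if max_items < 1:
        # Assumed validation; the page does not state a minimum.
        raise ValueError("max_items must be a positive integer")
    return {"maxItems": max_items}


def fetch_models(max_items: int = 500) -> list[dict]:
    """Run the actor and return its dataset items (requires network access)."""
    # Imported lazily so build_run_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(max_items))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


print(build_run_input(500))
```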

Output

Each dataset item represents one AI model. Example record:

{
  "model_slug": "claude-4-opus",
  "model_name": "Claude 4 Opus",
  "provider": "Anthropic",
  "release_date": "2025-05-22",
  "parameter_count": null,
  "context_window_tokens": 200000,
  "aa_quality_index": 57.4,
  "mmlu_pro_score": 0.812,
  "gpqa_diamond_score": 0.738,
  "humaneval_score": 0.921,
  "math_score": 84.1,
  "chatbot_arena_elo": null,
  "aider_polyglot_score": null,
  "livecodebench_score": 0.703,
  "mmmu_score": null,
  "benchmark_breakdown": "{\"agentic_index\":45.2,\"coding_index\":68.1,...}",
  "price_input_usd_per_million": 15,
  "price_output_usd_per_million": 75,
  "price_blended_usd_per_million": 30,
  "throughput_tokens_per_second": 58.3,
  "latency_first_token_ms": 1204,
  "hosting_providers": "[\"Anthropic\",\"Amazon Bedrock\",\"Google Vertex AI\"]",
  "cheapest_provider": "Amazon Bedrock",
  "fastest_provider": null,
  "license": "proprietary",
  "is_open_weight": false,
  "profile_url": "https://artificialanalysis.ai/models/claude-4-opus",
  "scraped_at": "2026-05-31T08:00:00.000Z"
}

Notes on specific fields:

  • chatbot_arena_elo and aider_polyglot_score are always null. Artificial Analysis does not track these metrics; they would require separate scrapers for Chatbot Arena and Aider.chat.
  • benchmark_breakdown is a JSON string containing additional sub-benchmarks (agentic_index, coding_index, math_index, HLE, AIME-2025, IFBench, SciCode, LCR, Omniscience).
  • hosting_providers is a JSON string array of all providers offering this model.
  • fastest_provider is always null because a per-provider throughput breakdown is not available on the listing page.
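
Because benchmark_breakdown and hosting_providers arrive as JSON strings, a consumer will usually want to decode them before use. The sketch below shows one way to do that; the helper name `decode_record` is hypothetical, and the sample record keeps only the two sub-benchmark keys that appear in the example above.

```python
import json

JSON_STRING_FIELDS = ("benchmark_breakdown", "hosting_providers")


def decode_record(record: dict) -> dict:
    """Return a copy of the record with JSON-string fields parsed."""
    decoded = dict(record)
    for field in JSON_STRING_FIELDS:
        raw = decoded.get(field)
        if isinstance(raw, str):
            try:
                decoded[field] = json.loads(raw)
            except json.JSONDecodeError:
                # Leave the raw string in place if it is not valid JSON.
                pass
    return decoded


record = {
    "model_slug": "claude-4-opus",
    "benchmark_breakdown": "{\"agentic_index\":45.2,\"coding_index\":68.1}",
    "hosting_providers": "[\"Anthropic\",\"Amazon Bedrock\",\"Google Vertex AI\"]",
}
decoded = decode_record(record)
print(decoded["benchmark_breakdown"]["coding_index"])
print(decoded["hosting_providers"][1])
```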

Notes

  • The actor makes a single HTTP request to https://artificialanalysis.ai/models. No proxy required.
  • The full dataset (~370 models) is available in one request. Use maxItems: 500 to get everything.
  • Prices and benchmarks on Artificial Analysis update frequently, so run the actor periodically for up-to-date data.
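
For periodic runs, one simple way to spot price changes is to compare two snapshots keyed by model_slug. This is a sketch under the assumption that each snapshot is a list of output records as documented above; the helper name `price_changes` and the sample values are hypothetical.

```python
def price_changes(old: list[dict], new: list[dict],
                  field: str = "price_blended_usd_per_million") -> dict:
    """Map model_slug -> (old_price, new_price) for models whose price changed."""
    old_prices = {r["model_slug"]: r.get(field) for r in old}
    changes = {}
    for r in new:
        slug = r["model_slug"]
        # Only report models present in both snapshots.
        if slug in old_prices and old_prices[slug] != r.get(field):
            changes[slug] = (old_prices[slug], r.get(field))
    return changes


# Made-up snapshots for illustration only.
last_week = [
    {"model_slug": "model-a", "price_blended_usd_per_million": 30.0},
    {"model_slug": "model-b", "price_blended_usd_per_million": 4.0},
]
today = [
    {"model_slug": "model-a", "price_blended_usd_per_million": 25.0},
    {"model_slug": "model-b", "price_blended_usd_per_million": 4.0},
    {"model_slug": "model-c", "price_blended_usd_per_million": 1.0},
]
changed = price_changes(last_week, today)
print(changed)
```

Models that appear in only one snapshot are ignored here; newly listed or removed models would need a separate check.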
