Wikipedia Scraper | $5 / 1k | Fast & Reliable

Pricing

$4.99 / 1,000 results

Get full articles and detailed search results with the Wikipedia Scraper. Extract structured data including titles, summaries, citations, and full content. Ideal for market research, AI training, and competitive intelligence.

Rating

0.0

(0)

Developer

Fatih Tahta

Maintained by Community

Last modified: 7 days ago

Overview

Wikipedia is the world’s most comprehensive open encyclopedia, constantly updated across thousands of topics and languages. The Wikipedia Scraper automates the collection of article-level insights, transforming public encyclopedia pages into structured datasets ready for analysis.

The actor reliably gathers:

Complete article content with metadata such as titles, summaries, and publication details.
Reference counts, internal links, media assets, and infobox attributes for deeper context.
Search and category results expanded into the full articles they reference.

Run it once and receive consistent, ready-to-use records—no manual browsing, copying, or formatting required.

Why Use This Actor

Market researchers & analysts: Track company histories, industry timelines, and competitive narratives straight from a trusted knowledge base.
Developers & data teams: Feed LLM training pipelines, knowledge graphs, or semantic search indices with normalized Wikipedia data.
Content strategists & educators: Assemble curated reading lists, bibliographies, or citation-rich briefings without handcrafting each entry.
Knowledge operations & directory builders: Populate internal wikis, catalogues, or monitoring dashboards with up-to-date encyclopedia coverage.

Use it for lead and partner research, market landscaping, product discovery, directory building, due diligence prep, and any workflow that benefits from detailed, cited background information.

Input Parameters

Parameter	Type	Description	Default
`articleInputs`	array of strings	Provide Wikipedia article slugs or full URLs to fetch directly.	—
`searchInputs`	array of strings	Enter search queries or Wikipedia search result URLs to discover matching articles before scraping them.	—
`language`	string (select)	Wikipedia language edition to use for the article slugs and search queries provided.	`"en"`
`limit`	integer	Maximum number of articles saved per input. Useful for sampling or capping run size.	`50000`
`proxyConfiguration`	object	Configure the connection settings. The default Apify datacenter proxy keeps runs stable.	Apify datacenter proxy

Example Input

{
  "articleInputs": [
    "YouTube",
    "https://en.wikipedia.org/wiki/OpenAI"
  ],
  "searchInputs": [
    "generative AI",
    "https://en.wikipedia.org/w/index.php?search=cloud%20computing&title=Special:Search&fulltext=1"
  ],
  "language": "en",
  "limit": 250,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
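
The same input can be assembled programmatically. The following is a minimal Python sketch, not part of the actor itself: the helper name `build_run_input` is hypothetical, and the two checks it performs (at least one input, positive `limit`) are assumptions about sensible usage rather than documented actor behavior. Only the parameter names and defaults come from the table above.

```python
import json
from typing import Any, Dict, List, Optional


def build_run_input(
    article_inputs: Optional[List[str]] = None,
    search_inputs: Optional[List[str]] = None,
    language: str = "en",
    limit: int = 50000,
) -> Dict[str, Any]:
    """Assemble an input object using the documented parameter names."""
    # Drop empty or whitespace-only entries before sending them.
    articles = [a.strip() for a in (article_inputs or []) if a.strip()]
    searches = [s.strip() for s in (search_inputs or []) if s.strip()]

    # Assumption: a run with no inputs at all is a caller mistake.
    if not articles and not searches:
        raise ValueError("provide at least one article or search input")
    # Assumption: limit is meant to be a positive integer.
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    return {
        "articleInputs": articles,
        "searchInputs": searches,
        "language": language,
        "limit": limit,
        "proxyConfiguration": {"useApifyProxy": True},
    }


run_input = build_run_input(
    article_inputs=["YouTube", "https://en.wikipedia.org/wiki/OpenAI"],
    search_inputs=["generative AI"],
    limit=250,
)
print(json.dumps(run_input, indent=2))
```

With the `apify-client` Python package installed and an API token available, a dictionary like this could be passed as `run_input` to `ApifyClient(token).actor("fatihtahta/wikipedia-scraper").call(...)`. That step is left out of the sketch because it needs network access and credentials.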

Example Output

{
  "title": "YouTube",
  "pageId": 3524766,
  "language": "en",
  "url": "https://en.wikipedia.org/wiki/YouTube",
  "referencesCount": 409,
  "internalLinks": [
    "https://en.wikipedia.org/wiki/Online_video_platform",
    "https://en.wikipedia.org/wiki/Alphabet_Inc.",
    "https://en.wikipedia.org/wiki/Social_media_platform"
  ],
  "imageUrls": [
    "https://upload.wikimedia.org/wikipedia/commons/thumb/2/20/YouTube_2024.svg/330px-YouTube_2024.svg.png"
  ],
  "infobox": {
    "Type of business": "Subsidiary",
    "Founded": "February 14, 2005",
    "Headquarters": "San Bruno, California, United States",
    "Owner": "Alphabet Inc."
  },
  "mainContent": "YouTube is an American online video sharing platform owned by Google...",
  "fetchedAt": "2025-11-05T10:11:18.247Z"
}

Field highlights

title, pageId, language, and url identify the article.
referencesCount, internalLinks, and imageUrls show sourcing depth and media assets.
infobox compiles structured summary facts.
mainContent delivers the full article body for text analysis or summarization.
fetchedAt records when the data was collected.
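
As an illustration of how these fields might be consumed downstream, here is a small Python sketch that condenses one output record into a summary row. The function name `summarize_record` and the summary keys are hypothetical; the field names are taken from the example output, and the sketch assumes `fetchedAt` is always an ISO 8601 UTC timestamp ending in `Z`, as in that example.

```python
from datetime import datetime
from typing import Any, Dict


def summarize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one scraped article record to a few analysis-friendly values."""
    # Replace the trailing "Z" so datetime.fromisoformat also works on
    # Python versions older than 3.11.
    fetched_at = datetime.fromisoformat(record["fetchedAt"].replace("Z", "+00:00"))
    return {
        "title": record["title"],
        "url": record["url"],
        "references": record.get("referencesCount", 0),
        "internal_links": len(record.get("internalLinks", [])),
        "images": len(record.get("imageUrls", [])),
        "infobox_keys": sorted(record.get("infobox", {})),
        "content_chars": len(record.get("mainContent", "")),
        "fetched_at": fetched_at,
    }


# Sample record copied from the example output (content truncated as shown there).
sample = {
    "title": "YouTube",
    "pageId": 3524766,
    "language": "en",
    "url": "https://en.wikipedia.org/wiki/YouTube",
    "referencesCount": 409,
    "internalLinks": [
        "https://en.wikipedia.org/wiki/Online_video_platform",
        "https://en.wikipedia.org/wiki/Alphabet_Inc.",
        "https://en.wikipedia.org/wiki/Social_media_platform",
    ],
    "imageUrls": [
        "https://upload.wikimedia.org/wikipedia/commons/thumb/2/20/YouTube_2024.svg/330px-YouTube_2024.svg.png"
    ],
    "infobox": {
        "Type of business": "Subsidiary",
        "Founded": "February 14, 2005",
        "Headquarters": "San Bruno, California, United States",
        "Owner": "Alphabet Inc.",
    },
    "mainContent": "YouTube is an American online video sharing platform owned by Google...",
    "fetchedAt": "2025-11-05T10:11:18.247Z",
}

summary = summarize_record(sample)
print(summary["title"], summary["references"], summary["internal_links"])
```

Using `.get` with defaults for the optional-looking fields is a defensive choice in this sketch; whether the actor ever omits them is not stated in this document.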

Notes & Limitations

Wikipedia content changes frequently; schedule runs to keep datasets current.
Always review and respect Wikipedia’s licensing terms and robots guidelines when redistributing or republishing material.
Use the data responsibly, especially when combining it with other datasets or personal information.
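
To decide which articles are due for a refresh, the `fetchedAt` field can be compared against a maximum age. The Python sketch below shows one way to do that. The helper `stale_urls`, the 7-day threshold, and the second sample record are illustrative assumptions, not actor features or real run data.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List, Optional


def stale_urls(
    records: Iterable[Dict[str, Any]],
    max_age_days: int = 7,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return URLs of records fetched more than max_age_days before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for record in records:
        # Same "Z" handling as before so older Python versions can parse it.
        fetched_at = datetime.fromisoformat(record["fetchedAt"].replace("Z", "+00:00"))
        if fetched_at < cutoff:
            stale.append(record["url"])
    return stale


records = [
    # Timestamp from the example output.
    {"url": "https://en.wikipedia.org/wiki/YouTube", "fetchedAt": "2025-11-05T10:11:18.247Z"},
    # Hypothetical timestamp, made up for this illustration.
    {"url": "https://en.wikipedia.org/wiki/OpenAI", "fetchedAt": "2025-11-20T08:00:00.000Z"},
]

# A fixed reference time keeps the example deterministic.
reference_time = datetime(2025, 11, 21, tzinfo=timezone.utc)
print(stale_urls(records, max_age_days=7, now=reference_time))
# → ['https://en.wikipedia.org/wiki/YouTube']
```

The returned URLs could then be fed back into `articleInputs` for the next scheduled run.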

Support

Questions or custom needs? Open an issue on the Issues tab of the actor page in Apify Console. Issues are handled around the clock.

Happy Scraping,

Fatih

URL: https://apify.com/fatihtahta/wikipedia-scraper
