URL: https://apify.com/wetyr_corporation/wikipedia-pro-scraper

Wikipedia Pro Scraper - Sections + Infobox · Apify


๐Ÿ‘ Wikipedia Pro Scraper - Sections, Infobox, References avatar

Wikipedia Pro Scraper - Sections, Infobox, References

Pricing

Pay per event

Wikipedia scraper for AI/RAG. Extracts structured sections, infobox key-value data, references. Multilingual, batch-friendly. Ready for vector databases.

Rating

0.0

(0)

Developer

๐Ÿ‘ WETYR

WETYR

Maintained by Community

Actor stats

0

Bookmarked

2

Total users

1

Monthly active users

2 months ago

Last modified

Wikipedia Pro Scraper

The Wikipedia scraper built for AI/RAG pipelines. Extracts:

  • Structured sections (Introduction, History, etc.) ready for chunking
  • Infobox key-value data (birthdate, country, founder, etc.), perfect for fact extraction
  • References with full citation text
  • Internal wikilinks for knowledge graph building
  • Image URLs for multimodal datasets
  • Categories + langlinks for cross-language enrichment

Multilingual (300+ Wikipedia editions). Properly attributed under CC BY-SA.
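To illustrate what "ready for chunking" could look like downstream, here is a minimal sketch that turns one scraped article into per-section chunks for a vector database. The record shape (`title`, `language`, `sections` with `heading` and `text`) is a hypothetical assumption for illustration; the Actor's actual output schema is not documented on this page and may differ.

```python
from typing import Iterator

# Hypothetical record shape, NOT the Actor's documented output schema.
record = {
    "title": "Ada Lovelace",
    "language": "en",
    "sections": [
        {"heading": "Introduction", "text": "Ada Lovelace was an English mathematician."},
        {"heading": "Legacy", "text": ""},
        {"heading": "Biography", "text": "She was born in London in 1815."},
    ],
}


def section_chunks(rec: dict) -> Iterator[dict]:
    """Yield one chunk per non-empty section, tagged with its source article."""
    for index, section in enumerate(rec.get("sections", [])):
        text = section.get("text", "").strip()
        if not text:
            continue  # skip empty sections rather than embedding blank text
        yield {
            "id": f'{rec["language"]}:{rec["title"]}#{index}',
            "heading": section["heading"],
            "text": text,
        }


chunks = list(section_chunks(record))
```

Keeping the section index in the chunk id makes it possible to trace an embedding back to the section it came from.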

Why Pro vs other Wikipedia scrapers

Most Wikipedia scrapers on Apify return raw HTML or stripped plaintext. Ours gives you:

  • Sectioned output for vector DB chunking
  • Parsed infobox (no manual table parsing)
  • Clean text with citation markers removed
  • Wikilinks graph for knowledge graphs
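For context on what "citation markers removed" means in practice, the sketch below strips bracketed markers such as `[1]` or `[citation needed]` from plain text. It is an illustrative regex under simple assumptions, not the Actor's implementation, which is not shown on this page.

```python
import re

# Matches bracketed numeric markers such as [1] or [23], and [citation needed].
# This pattern is an assumption for illustration; real articles have more variants.
CITATION_MARKER = re.compile(r"\[(?:\d+|citation needed)\]")


def strip_citation_markers(text: str) -> str:
    """Remove inline citation markers and tidy leftover spacing."""
    cleaned = CITATION_MARKER.sub("", text)
    # Collapse runs of spaces that removal may leave behind.
    return re.sub(r" {2,}", " ", cleaned).strip()


sample = "Paris is the capital of France.[1][2] It lies on the Seine.[citation needed]"
cleaned_sample = strip_citation_markers(sample)
```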

Pricing

  • $0.05 per actor start
  • $0.01 per article scraped
  • $0.003 per infobox parsed

Typical run: 1,000 articles with infoboxes = ~$13.05.
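The typical-run figure follows directly from the three per-event prices above. A small sketch of the arithmetic, assuming a single actor start and one infobox per article (the function and constant names are illustrative, not part of any Apify API):

```python
from decimal import Decimal

# Prices as listed on this page, in USD.
ACTOR_START = Decimal("0.05")
PER_ARTICLE = Decimal("0.01")
PER_INFOBOX = Decimal("0.003")


def estimate_cost(articles: int, infoboxes: int, starts: int = 1) -> Decimal:
    """Estimate the pay-per-event cost of a run in USD."""
    return starts * ACTOR_START + articles * PER_ARTICLE + infoboxes * PER_INFOBOX


typical = estimate_cost(articles=1000, infoboxes=1000)
```

For 1,000 articles that all have infoboxes, this is $0.05 + $10.00 + $3.00, matching the ~$13.05 quoted above. Articles without an infobox would presumably not incur the $0.003 event, so real runs may cost slightly less.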

You might also like

Wikipedia Article Scraper

rupom888/wikipedia-article-scraper

Scrape Wikipedia articles using the official MediaWiki REST API. Search by keyword, look up specific titles, or scrape by URL. Extracts full article text, sections, infobox data, categories, references, images, and related articles. Supports 300+ languages.

Wikipedia Scraper

devilscrapes/wikipedia-article-scraper

Extract Wikipedia article text, summary, infobox, references, and categories via the Wikipedia API: one row per article, in any language, with export to JSON or CSV. We handle title normalisation, redirects, retries, and rate-limit pacing so your dataset arrives clean.

Wikipedia MCP Server

agentify/wikipedia-mcp-server

MCP server for Wikipedia, providing LLMs and clients with real-time access to Wikipedia articles, summaries, sections, and related information via Apify Actor.

Wikipedia Article Scraper

crawlerbros/wikipedia-scraper

Extract structured data from Wikipedia articles. Get summaries, categories, images, metadata, and descriptions using Wikipedia's official API. Supports 300+ languages.