Website Content Crawler

Pricing

from $0.01 / 1,000 results

Crawl websites for SEO audits. Extracts HTML, title, meta tags, headings, links, and text content from pages.

Automatic sitemap detection and parsing
Metadata extraction (title, description, OG tags)
Heading structure (H1, H2, H3)
Internal and external link analysis
Image extraction with alt text
Word count

Rating

0.0

(0)

Developer

The Howlers

Maintained by Community

Actor stats

Total users: 114
Issues response: 90 days
Last modified: 2 months ago

Website Crawler - SEO Audit Crawler with Markdown Extraction

Fast, reliable website crawler built for SEO audits and AI/LLM content analysis. Auto-discovers sitemaps, extracts metadata, headings, and LLM-ready markdown content from every page.

BYOK (Bring Your Own Key) -- you provide your own API credentials.

Before You Start

This actor requires your own API credentials to fetch real data.

Which key you need: a Firecrawl API key for web crawling. When a key is provided, Firecrawl is used as the primary crawler (faster, and it handles JS rendering). The actor falls back to Playwright if no key is provided or if Firecrawl fails.
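The primary/fallback behavior described above can be pictured with a small sketch. This is not the actor's actual code: `fetch_page`, `firecrawl_fetch`, and `playwright_fetch` are hypothetical names, and the two fetchers are passed in as plain callables so the selection logic stays visible.

```python
from typing import Callable, Optional


def fetch_page(
    url: str,
    api_key: Optional[str],
    use_firecrawl: bool,
    firecrawl_fetch: Callable[[str, str], str],  # hypothetical primary crawler
    playwright_fetch: Callable[[str], str],      # hypothetical fallback crawler
) -> str:
    """Return page content, preferring the primary crawler when it is usable."""
    if use_firecrawl and api_key:
        try:
            return firecrawl_fetch(url, api_key)
        except Exception:
            # The primary crawler failed, so fall through to the fallback.
            pass
    # Reached when Firecrawl is disabled, no key was given, or Firecrawl failed.
    return playwright_fetch(url)
```

Passing the crawlers as arguments is only a design choice for this sketch; it makes the three fallback conditions (disabled, no key, failure) easy to exercise without any network access.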

You can test with Demo Mode first (free, no key needed) to see the output format before committing.

Quick Start

Test with Demo Mode (free, no API key needed)

{
  "demoMode": true,
  "startUrls": "https://example.com"
}

Run with real data

{
  "demoMode": false,
  "startUrls": "https://example.com",
  "maxCrawlPages": 25,
  "maxCrawlDepth": 2,
  "crawlSitemap": true,
  "firecrawlApiKey": "YOUR_API_KEY_HERE",
  "useFirecrawl": true
}
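For readers who want to start a run from code rather than the console, here is a minimal sketch that only builds the HTTP request and does not send it. It assumes the Apify API v2 "run actor" endpoint and Bearer-token authentication; `build_run_request` and `YOUR_APIFY_TOKEN` are illustrative names, and the Apify token is separate from the Firecrawl key. The actor ID is taken from this page's URL. `startUrls` is copied as a string from the example above, although the parameter table lists its type as array, so which form the actor accepts is an open assumption.

```python
import json
import urllib.request

# Actor ID from the page URL, with "/" replaced by "~" for use in the API path.
ACTOR_ID = "alizarin_refrigerator-owner~website-crawler"


def build_run_request(apify_token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that would start an actor run."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {apify_token}",
        },
        method="POST",
    )


run_input = {
    "demoMode": False,
    "startUrls": "https://example.com",  # string as in the example; table says array
    "maxCrawlPages": 25,
    "maxCrawlDepth": 2,
    "crawlSitemap": True,
    "firecrawlApiKey": "YOUR_API_KEY_HERE",
    "useFirecrawl": True,
}
request = build_run_request("YOUR_APIFY_TOKEN", run_input)
# Sending is left out on purpose: urllib.request.urlopen(request) would start a billed run.
```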

Input Parameters

Parameter	Type	Default	Required	Description
`startUrls`	array	-	No	URLs to start crawling from
`maxCrawlPages`	integer	`25`	No	Maximum number of pages to crawl
`maxCrawlDepth`	integer	`2`	No	Maximum link depth to follow
`crawlSitemap`	boolean	`true`	No	Try to find and use sitemap for URL discovery
`firecrawlApiKey`	string	-	Yes*	Your Firecrawl API key for web crawling. When provided, Firecrawl is used as the primary crawler (faster, handles JS rendering). Falls back to Playwright if not provided or if Firecrawl fails.
`useFirecrawl`	boolean	`true`	No	Use Firecrawl as the primary crawling method. Requires Firecrawl API key. Falls back to Playwright if disabled or if Firecrawl fails.
`demoMode`	boolean	`false`	No	Run in demo mode without real credentials. Returns mock success response for testing.
`webhookUrl`	string	-	No	URL to POST results when scraping completes (Zapier, Make, n8n, custom endpoint)

*Required when Demo Mode is off.
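The defaults and the footnote rule in the table can be checked before a run with a small pre-flight helper. This is a sketch under stated assumptions: `normalize_input` is a hypothetical name, the defaults are copied from the table, and the error text mirrors the "API key is required" message from the troubleshooting section. The actor's own validation may differ.

```python
# Defaults copied from the Input Parameters table.
DEFAULTS = {
    "maxCrawlPages": 25,
    "maxCrawlDepth": 2,
    "crawlSitemap": True,
    "useFirecrawl": True,
    "demoMode": False,
}


def normalize_input(raw: dict) -> dict:
    """Fill in table defaults and enforce the 'required when Demo Mode is off' rule."""
    merged = {**DEFAULTS, **raw}
    if not merged["demoMode"] and not merged.get("firecrawlApiKey"):
        raise ValueError("API key is required")
    for name in ("maxCrawlPages", "maxCrawlDepth"):
        value = merged[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    return merged
```

Calling `normalize_input({"demoMode": True})` returns the demo input with all defaults filled in, while `normalize_input({})` raises because Demo Mode defaults to off and no key is present.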

Pricing

This actor uses pay-per-event billing:

Event	Description	Price
Page Crawled	Per page crawled with full HTML, text, and markdown extraction	$0.05
Sitemap Discovered	Per sitemap discovered and parsed for URL extraction	$0.05

Demo mode is free -- no charges for sample data.
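As a rough worked example of the event prices above, the sketch below estimates a run's cost from the number of pages crawled and sitemaps parsed. `estimate_cost_usd` is a hypothetical helper; it uses only the two event prices in the table and ignores any other platform charges.

```python
from decimal import Decimal

# Event prices copied from the pricing table.
PRICE_PER_PAGE = Decimal("0.05")
PRICE_PER_SITEMAP = Decimal("0.05")


def estimate_cost_usd(pages: int, sitemaps: int = 0, demo_mode: bool = False) -> Decimal:
    """Estimate pay-per-event charges; demo mode is free per the note above."""
    if demo_mode:
        return Decimal("0.00")
    return pages * PRICE_PER_PAGE + sitemaps * PRICE_PER_SITEMAP


# A run at the default page limit with one sitemap parsed:
print(estimate_cost_usd(pages=25, sitemaps=1))  # 1.30
```

`Decimal` is used instead of floats so that sums of 0.05 stay exact.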

Troubleshooting

"API key is required"

You have Demo Mode turned off but didn't provide an API key. Either:

Turn Demo Mode on to test with sample data
Add your API key in the input

"API error 403" or "Unauthorized"

Your API key is invalid, expired, or doesn't have access to this specific API endpoint. Double-check your key and account permissions.

"API error 429" or "Rate limit"

Too many requests. Wait a minute and try again, or reduce the number of pages per run (for example, by lowering maxCrawlPages).
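If you call the actor from your own script, the "wait and try again" advice can be automated with exponential backoff. This is a generic sketch, not part of the actor: `RateLimitError` is a hypothetical exception standing in for however your client reports an "API error 429", and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error standing in for an 'API error 429' response."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call func, retrying on RateLimitError with delays of 1x, 2x, 4x base_delay."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.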

No results or empty dataset

Check the run log for error messages. Common causes:

Invalid input format (check the examples above)
API key without proper permissions
The target data doesn't exist or is too small to track

How do I test without an API key?

Enable Demo Mode in the input. This returns realistic sample data so you can verify the output format works for your workflow.

Built by John Rippy | Actor Arsenal

URL: https://apify.com/alizarin_refrigerator-owner/website-crawler
