URL: https://apify.com/predictable_function/my-actor

Pricing

Pay per usage

Sitemap Health Validator

Validates sitemap.xml files and checks health of listed URLs

Rating

5.0

(2)

Developer

riya rawat

Maintained by Community

Actor stats

Bookmarked: 1
Total users: 66
Monthly active users: 1
Last modified: 5 months ago

JavaScript Website Scraper (Crawlee + Cheerio)

A fast, lightweight Apify Actor for scraping static and semi-dynamic websites using Crawlee's CheerioCrawler. The Actor extracts page titles and URLs from provided start pages and stores the results in an Apify Dataset.

The Actor is designed for performance, low resource usage, and easy extensibility. It is fully compliant with Apify Actor Store rules and suitable for the Apify $1 Million Actor Challenge.


Key Features

  • Fast HTML parsing using Cheerio (no browser required)
  • Crawlee-powered request handling and concurrency
  • Low memory usage (works on the Apify free plan)
  • Proxy support for reduced blocking
  • Structured and consistent dataset output
  • Easy to customize and extend

Input

startUrls

Type: Array of objects, each with a url property (see Example Input)
Description: List of URLs where the crawler starts.

maxPagesPerCrawl

Type: Number
Description: Maximum number of pages to scrape.

Example Input

{
    "startUrls": [
        { "url": "https://example.com" }
    ],
    "maxPagesPerCrawl": 10
}
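
The input shape above can be checked before a crawl starts. The following is a minimal sketch, not the Actor's published code: the helper name normalizeInput is hypothetical, and it assumes that both fields are required, that every startUrls entry is an object with a url property (as in the example), and that only http and https URLs are accepted.

```javascript
// Hypothetical helper for illustration; the Actor's real validation may differ.
function normalizeInput(input) {
    const { startUrls, maxPagesPerCrawl } = input ?? {};

    // Assumption: at least one start URL is required.
    if (!Array.isArray(startUrls) || startUrls.length === 0) {
        throw new Error('startUrls must be a non-empty array');
    }

    const urls = startUrls.map((entry, index) => {
        let parsed;
        try {
            // The URL constructor throws on missing or malformed values.
            parsed = new URL(entry?.url);
        } catch {
            throw new Error(`startUrls[${index}].url is not a valid URL`);
        }
        // Assumption: only web URLs make sense for an HTML crawler.
        if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
            throw new Error(`startUrls[${index}].url must use http or https`);
        }
        return parsed.href;
    });

    // Assumption: the page limit must be a whole number of at least 1.
    if (!Number.isInteger(maxPagesPerCrawl) || maxPagesPerCrawl < 1) {
        throw new Error('maxPagesPerCrawl must be a positive integer');
    }

    return { startUrls: urls, maxPagesPerCrawl };
}
```

Called with the example input, this sketch returns the start URL in normalized form (https://example.com/, with a trailing slash added by the URL parser) together with the limit of 10. An empty startUrls array or a non-integer limit raises an error instead of starting a crawl.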
