URL: https://apify.com/scrappy_garden/sitemap-generator

Sitemap Generator - Crawl Website & Create XML Sitemap

Pricing

$4.99/month + usage

Generate an XML sitemap for any website. Crawls internal pages from start URLs (with depth + page limits), deduplicates URLs, and stores a ready-to-submit sitemap.xml plus a structured dataset and summary for SEO audits.

Rating

0.0 (0)

Developer

๐Ÿ‘ Bikram Adhikari

Bikram Adhikari

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 11
  • Monthly active users: 2
  • Last modified: 5 months ago

Generate an XML sitemap (sitemap.xml) for any website by crawling internal pages from one or more start URLs.

This Actor is designed for:

  • SEO audits (discover missing pages)
  • Creating/refreshing sitemaps for search engines
  • QA / monitoring of site URL coverage

What it does

  • Crawls internal links (same hostname as the provided start URLs)
  • Deduplicates URLs
  • Stores sitemap.xml in the default key-value store
  • If the site has more than 50,000 discovered URLs, it creates multiple sitemap-*.xml parts plus a sitemap-index.xml (and sitemap.xml will contain the index for compatibility)
  • Writes a dataset item for each crawled page (included/excluded + reason)
  • Writes a SUMMARY JSON report (counts, settings, sitemap URL count)
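
The splitting rule described above can be sketched in Python. This is only an illustration of the behavior, not the Actor's own code: the function names and the `base_url` parameter (where the parts would be hosted) are assumptions, and `<lastmod>`, `changefreq`, and `priority` are left out for brevity.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # threshold mentioned in the list above
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(urls):
    """Render one <urlset> document for the given URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return f'{XML_HEADER}<urlset xmlns="{NS}">\n{entries}</urlset>\n'


def build_sitemap_files(urls, base_url, limit=MAX_URLS_PER_SITEMAP):
    """Return a {file name: XML text} mapping (hypothetical helper)."""
    unique = list(dict.fromkeys(urls))  # deduplicate, keep first-seen order
    if len(unique) <= limit:
        return {"sitemap.xml": build_urlset(unique)}

    files = {}
    index_entries = ""
    for part, start in enumerate(range(0, len(unique), limit), start=1):
        name = f"sitemap-{part}.xml"
        files[name] = build_urlset(unique[start:start + limit])
        index_entries += f"  <sitemap><loc>{escape(base_url + name)}</loc></sitemap>\n"

    index = f'{XML_HEADER}<sitemapindex xmlns="{NS}">\n{index_entries}</sitemapindex>\n'
    files["sitemap-index.xml"] = index
    files["sitemap.xml"] = index  # sitemap.xml mirrors the index for compatibility
    return files
```

Passing a small `limit` (for example `limit=2`) makes it easy to see the multi-part layout without needing 50,000 URLs.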

Input

  • startUrls (required): Start URLs (request list)
  • maxPages: Max pages to crawl (limits total requests)
  • maxDepth: Max link depth from the start URLs
  • ignoreUrlPatterns: Array of regex strings to exclude URLs
  • includeQueryParams: Include ?query=params in sitemap URLs
  • includeFragments: Include #fragments in sitemap URLs (usually disabled)
  • includeLastModified: If enabled, uses the HTTP Last-Modified header for <lastmod> when available
  • respectRobotsTxt: If enabled, skips URLs disallowed by robots.txt for User-agent: * (best-effort)
  • robotsTxtTimeoutSecs: Timeout for downloading robots.txt
  • changefreq, priority: Optional sitemap hints applied to all URLs
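
To make the URL-related options more concrete, here is a rough Python sketch of how includeQueryParams, includeFragments, ignoreUrlPatterns, and respectRobotsTxt could behave. The helper names, the default values, and the host lowercasing are assumptions made for illustration; the Actor's actual normalization rules are not documented on this page.

```python
import re
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def normalize_url(url, include_query_params=False, include_fragments=False):
    """Drop the query string and fragment unless the matching option is on."""
    parts = urlsplit(url)
    query = parts.query if include_query_params else ""
    fragment = parts.fragment if include_fragments else ""
    # Assumption: lowercase the host and use "/" for an empty path.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path or "/", query, fragment))


def is_ignored(url, ignore_url_patterns):
    """True if any regex string from ignoreUrlPatterns matches the URL."""
    return any(re.search(pattern, url) for pattern in ignore_url_patterns)


def allowed_by_robots(robots_txt, url):
    """Check a URL against robots.txt rules for User-agent: *."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)
```

For example, a pattern such as `"/tag/"` in ignoreUrlPatterns would exclude every URL whose path contains `/tag/`.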

Output

Key-value store

  • sitemap.xml (XML)
  • sitemap-index.xml (XML, only for large sites)
  • sitemap-1.xml, sitemap-2.xml, ... (XML parts, only for large sites)
  • SUMMARY (JSON)

Dataset

Each item contains:

  • url, normalizedUrl, statusCode, contentType
  • depth, discoveredFrom
  • includedInSitemap, exclusionReason
  • lastModified, crawledAt
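
For an SEO audit, the dataset items can be grouped by exclusion reason. The sketch below assumes the items have already been downloaded as a list of dictionaries with the field names listed above; the sample exclusionReason values are invented for illustration and may not match what the Actor emits.

```python
from collections import Counter


def summarize_items(items):
    """Count included pages and group excluded pages by exclusionReason."""
    included = sum(1 for item in items if item.get("includedInSitemap"))
    reasons = Counter(
        item.get("exclusionReason") or "unknown"
        for item in items
        if not item.get("includedInSitemap")
    )
    return {"included": included, "excluded": dict(reasons)}


# Hypothetical sample items; the reason strings are placeholders.
sample = [
    {"url": "https://example.com/", "includedInSitemap": True, "exclusionReason": None},
    {"url": "https://example.com/tag/a", "includedInSitemap": False, "exclusionReason": "ignored-pattern"},
    {"url": "https://example.com/old", "includedInSitemap": False, "exclusionReason": "non-200"},
]
report = summarize_items(sample)
```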

SEO keywords

sitemap generator, xml sitemap generator, website sitemap crawler, generate sitemap.xml, seo sitemap tool, internal link crawler

Quick start

Store page: https://apify.com/scrappy_garden/sitemap-generator

Paste this into Input and click Run:

{
  "startUrls": [
    { "url": "https://example.com/" }
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
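
The same input can also be sent programmatically. The following is a minimal sketch that assumes the `apify-client` Python package is installed and that you have an API token; the helper names are hypothetical, and the return shape of `call()` may differ between client versions, so treat this as a starting point rather than a reference.

```python
def build_run_input(start_urls, use_apify_proxy=False):
    """Build the same input object as the JSON above."""
    return {
        "startUrls": [{"url": url} for url in start_urls],
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def run_sitemap_actor(token, start_urls):
    """Run the Actor and return the stored sitemap.xml (hypothetical helper)."""
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor("scrappy_garden/sitemap-generator").call(
        run_input=build_run_input(start_urls)
    )
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("sitemap.xml")
    return record["value"] if record else None
```

Calling `run_sitemap_actor("<YOUR_API_TOKEN>", ["https://example.com/"])` would start a run and fetch sitemap.xml from the run's default key-value store.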

Outputs (what you get)

  • Dataset: items typically include fields such as url, statusCode, includedInSitemap, exclusionReason, depth, lastModified, crawledAt.
  • Key-value store: SUMMARY, sitemap.xml

Tips (trust + predictable results)

  • Start with 1–3 URLs to validate behavior, then scale up.
  • If a target blocks requests, enable Proxy and/or slow down concurrency in Input.
  • Use the SUMMARY / REPORT keys (when present) for automation pipelines and monitoring.

Search keywords

sitemap generator, sitemap generator - crawl website & create xml sitemap, website audit, seo, sitemap

You might also like

Sitemap Generator - Creates sitemap.xml for any domain

wisteria_banjo/sitemap-generator---creates-sitemap-xml-for-any-domain

Generate a clean, standards-compliant sitemap.xml for a website. This actor crawls a single website, discovers all indexable pages, and produces: ✅ A ready-to-submit sitemap.xml (Google-compliant) ✅ A structured JSON dataset of discovered URLs (for auditing, reporting, and billing)

13

Sitemap Generator

gentle_cloud/sitemap-generator

Crawl websites and generate XML sitemaps with configurable depth and page limits. Discover all pages, extract metadata, and output a ready-to-use sitemap.xml.

Sitemap to URL Crawler - Extract Sitemap.xml URLs for RAG

logiover/sitemap-to-url-crawler

Extract all URLs from any sitemap.xml recursively. Export sitemap URLs to CSV/JSON for RAG pipelines, SEO audits, and LLM training datasets.

Sitemap Generator

datawinder/sitemap-generator

Automatically crawl a website and generate an SEO-ready sitemap in XML, HTML, or TXT format. Supports crawl depth limits, URL include/exclude patterns, and optional merging with an existing sitemap.xml. Ideal for SEO audits, site migrations, and automation.

๐Ÿ‘ User avatar

DatawinderLabs

2

Sitemap URL Extractor

onescales/sitemap-url-extractor

Provide a website link to a sitemap.xml and the app will extract and list all URLs in the sitemap as well as additional data in the sitemap (i.e. https://onescales.com/sitemap.xml).

568

5.0

Sitemap URL Extractor

getdataforu/sitemap-url-extractor

Provide a website link to a sitemap.xml and the app will extract and list all URLs in the sitemap as well as additional data in the sitemap (i.e. https://onescales.com/sitemap.xml).

2

5.0

Sitemap Scraper

pvillalva/sitemap-scraper

The Sitemap Scraper extracts and outputs all URLs from a given sitemap.

๐Ÿ‘ User avatar

Percival Villalva

268