
URL: https://apify.com/gentle_cloud/sitemap-generator

Sitemap Generator · Apify


Pricing: from $0.01 / actor start

Crawl websites and generate XML sitemaps with configurable depth and page limits. Discover all pages, extract metadata, and output a ready-to-use sitemap.xml.

Rating: 0.0 (0)
Developer: Monkey Coder (maintained by Community)
Actor stats: 1 bookmarked, 5 total users, 0 monthly active users
Last modified: 3 months ago

Generate XML sitemaps by crawling websites with configurable depth and page limits.

What it does

Sitemap Generator starts from one or more URLs, crawls internal links using breadth-first traversal, and produces:

  • Per-page crawl metadata in the Apify dataset
  • A complete XML sitemap string for each start URL

This actor is designed for SEO discovery, site inventory checks, and quick sitemap generation from live websites.

Features

  • Crawls internal links only (same domain/subdomain family)
  • Breadth-first traversal with max_depth and max_pages
  • Handles relative URLs, fragments, query strings, redirects, and timeouts
  • Skips common non-HTML/static resources (images, CSS, JS, PDFs, archives, media)
  • Extracts page title, approximate word count, link counts, and HTTP metadata
  • Outputs XML sitemap in standard urlset format
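The traversal described above can be sketched roughly as follows. This is an illustration, not the actor's source: the helper names (`normalize`, `is_internal`, `looks_like_page`, `crawl`), the extension list, and the rule that "internal" means the start host or one of its subdomains are all assumptions. Link extraction is passed in as a `get_links` callable so the sketch runs without network access.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse

# Assumed skip list; the actor's real list is not documented.
SKIP_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js",
                   ".pdf", ".zip", ".gz", ".mp3", ".mp4")


def normalize(base_url, href):
    """Resolve href against base_url and drop any #fragment."""
    absolute, _fragment = urldefrag(urljoin(base_url, href))
    return absolute


def is_internal(url, root_host):
    """True for http(s) URLs on root_host or one of its subdomains (assumed rule)."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    return parts.scheme in ("http", "https") and (
        host == root_host or host.endswith("." + root_host)
    )


def looks_like_page(url):
    """False when the URL path ends in a known static-file extension."""
    return not urlparse(url).path.lower().endswith(SKIP_EXTENSIONS)


def crawl(start_url, get_links, max_depth=3, max_pages=100):
    """Breadth-first crawl; get_links(url) returns the hrefs found on url."""
    root_host = (urlparse(start_url).hostname or "").lower()
    start = normalize(start_url, "")
    seen = {start}
    queue = deque([(start, 0)])
    pages = []
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        pages.append({"url": url, "depth": depth})
        if depth >= max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        for href in get_links(url):
            link = normalize(url, href)
            if link not in seen and is_internal(link, root_host) and looks_like_page(link):
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, so a low max_pages value keeps the pages closest to the start URL.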

How to use

  1. Provide one or more Start URLs.
  2. Set Maximum Crawl Depth (default 3).
  3. Set Maximum Pages per start URL (default 100).
  4. Run the actor.

The actor writes one dataset item per discovered page with crawl metrics. For each start URL, the first dataset item includes sitemap_xml for all pages discovered in that crawl; on the remaining items sitemap_xml is null.
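Since only one item per crawl carries the sitemap, a consumer has to pick it out of the dataset. A minimal sketch, assuming the items have already been loaded as Python dictionaries shaped like the sample output; the function name `sitemaps_by_url` is hypothetical:

```python
def sitemaps_by_url(items):
    """Map each item that carries a sitemap to its XML string.

    Assumes sitemap_xml is null (None) on every item except the
    first item of each crawl.
    """
    return {
        item["url"]: item["sitemap_xml"]
        for item in items
        if item.get("sitemap_xml")
    }
```

Each value in the returned dictionary can then be written to its own sitemap.xml file.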

Input

  • start_urls (array, requestListSources editor)
  • max_depth (integer, default 3)
  • max_pages (integer, default 100)
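The three fields above could be assembled into a run input as in the sketch below. The `{"url": ...}` object shape is what Apify's requestListSources editor conventionally produces and is assumed here rather than confirmed for this actor; `build_run_input` is a hypothetical helper.

```python
def build_run_input(start_urls, max_depth=3, max_pages=100):
    """Build an input dictionary using the documented field names and defaults."""
    if not start_urls:
        raise ValueError("at least one start URL is required")
    if max_depth < 0 or max_pages < 1:
        raise ValueError("max_depth must be >= 0 and max_pages must be >= 1")
    return {
        # Assumed shape: one {"url": ...} object per start URL.
        "start_urls": [{"url": url} for url in start_urls],
        "max_depth": max_depth,
        "max_pages": max_pages,
    }
```

With the `apify-client` Python package, a dictionary like this would typically be passed as `run_input` when calling the actor (here `gentle_cloud/sitemap-generator`).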

Sample output JSON

{
  "url": "https://example.com/docs",
  "depth": 1,
  "status_code": 200,
  "content_type": "text/html; charset=utf-8",
  "title": "Documentation | Example",
  "last_modified": "Tue, 12 Mar 2024 09:12:11 GMT",
  "word_count": 842,
  "internal_links_count": 34,
  "external_links_count": 6,
  "sitemap_xml": null,
  "total_pages_found": 57,
  "crawl_started_at": "2026-03-18T12:34:56.000000+00:00"
}

In the first item of each crawl, sitemap_xml holds the full sitemap, for example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com</loc>
    <lastmod>2026-03-18</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>
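A urlset document of this shape can be produced with Python's standard library, as in the sketch below. The function name `build_sitemap` is hypothetical, and the priority rule (1.0 at the start URL, decreasing by 0.2 per depth level, never below 0.1) is an assumption for illustration only; the listing shows just the root entry and does not say how the actor assigns priority to deeper pages.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages, lastmod):
    """Serialize pages (dicts with "url" and "depth") into a urlset string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["url"]
        ET.SubElement(url, "lastmod").text = lastmod
        # Assumed rule: priority falls with depth, floored at 0.1.
        priority = max(0.1, 1.0 - 0.2 * page["depth"])
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Building the tree with ElementTree rather than string concatenation means characters such as `&` in query strings are escaped automatically.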

Notes about limits

  • max_depth and max_pages are safety limits; higher values increase run time and request volume.
  • Only HTTP/HTTPS pages are crawled.
  • Some sites block crawlers or require JavaScript rendering; this actor performs pure HTTP crawling, so such pages may be missed or returned incomplete.
  • word_count is an approximation derived from visible page text.
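One way to approximate a visible-text word count from raw HTML is sketched below. The actor's actual method is not documented, so the class and function names (`VisibleText`, `summarize`) and the choice of ignored tags are assumptions.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the page title and count words outside non-visible tags."""

    HIDDEN = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.words = 0
        self._in_title = False
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self._hidden_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._hidden_depth:
            self._hidden_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data  # the title is kept but not counted as body text
        elif not self._hidden_depth:
            self.words += len(data.split())


def summarize(html):
    """Return (title, approximate visible word count) for an HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.title.strip(), parser.words
```

Counting whitespace-separated tokens ignores CSS-hidden elements and text injected by JavaScript, which is one reason the figure can only be an approximation.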
