
URL: https://apify.com/himalyancoder/sitemap-generator

Sitemap Generator [DEPRECATED] · Apify


๐Ÿ‘ Sitemap Generator avatar

Sitemap Generator

Deprecated

Pricing

from $7.00 / 1,000 results


Rating: 0.0 (0)

Developer: Sameer Pun (maintained by Community)

Actor stats

  • Bookmarked: 1
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 4 months ago


Sitemap Generator Actor

Python Apify Actor that crawls a single hostname and generates sitemap files:

  • sitemap.xml (always an XML sitemap index)
  • sitemap-00001.xml, sitemap-00002.xml, ... (50,000 URLs max per chunk)
  • sitemap.html (optional)
  • sitemap.txt (optional)
  • sitemap-summary.json (run summary and output key references)
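
The index-plus-chunks layout listed above can be sketched as follows. This is a minimal illustration, not the Actor's own code: the helper names (chunk_urls, chunk_name, build_index) and the base_url parameter (where the chunk files would be hosted) are assumptions introduced here.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_CHUNK = 50_000  # per-chunk limit stated in the file list above


def chunk_urls(urls, size=MAX_URLS_PER_CHUNK):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def chunk_name(index):
    """Return the chunk file name for a 1-based index, e.g. sitemap-00001.xml."""
    return f"sitemap-{index:05d}.xml"


def build_index(base_url, chunk_count):
    """Render a sitemap index that points at every chunk file.

    `base_url` is a hypothetical hosting location for the chunk files.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for i in range(1, chunk_count + 1):
        loc = escape(f"{base_url.rstrip('/')}/{chunk_name(i)}")
        lines.append(f"  <sitemap><loc>{loc}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)
```

With this layout, sitemap.xml stays small no matter how many URLs are crawled, because it only lists the chunk files.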

The Actor respects robots.txt, includes only canonical URLs, deduplicates by normalized URL, and supports regex include/exclude filters.
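
A rough sketch of how deduplication by normalized URL and the regex filters might fit together. The normalization rules shown (lowercased host, dropped fragment and default port), the use of re.search, and the choice that exclude patterns win over include patterns are all assumptions for illustration; the Actor's actual rules are not documented here.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment and default ports (assumed rules)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    default_port = {"http": 80, "https": 443}.get(scheme)
    netloc = host if parts.port in (None, default_port) else f"{host}:{parts.port}"
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def is_allowed(url, include_patterns, exclude_patterns):
    """Apply the deny-list first, then the allow-list if one is configured."""
    if any(re.search(p, url) for p in exclude_patterns):
        return False
    if include_patterns:
        return any(re.search(p, url) for p in include_patterns)
    return True


def dedupe(urls, include_patterns=(), exclude_patterns=()):
    """Keep the first occurrence of each normalized URL that passes the filters."""
    seen = set()
    kept = []
    for url in urls:
        norm = normalize_url(url)
        if norm in seen or not is_allowed(norm, include_patterns, exclude_patterns):
            continue
        seen.add(norm)
        kept.append(norm)
    return kept
```

For example, https://example.com/a and https://EXAMPLE.com/a#x collapse to a single entry under these rules.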

Input

| Field | Type | Default | Description |
|---|---|---|---|
| startUrl | string | required | Start page (http/https) |
| maxDepth | integer | 3 | Max crawl depth (startUrl is depth 0) |
| maxPages | integer | 1000 | Max fetched pages |
| concurrency | integer | 10 | Concurrent HTTP workers (1-50) |
| allowNoindex | boolean | false | If true, includes pages with noindex directives |
| sitemapSeedUrls | string[] | [] | Optional sitemap XML URLs to seed discovery (in addition to robots.txt Sitemap: entries) |
| includePatterns | string[] | [] | Regex allow-list for URLs |
| excludePatterns | string[] | [] | Regex deny-list for URLs |
| outputFormats | string[] | ["html","txt"] | Optional extra outputs (html, txt) |
| lastmodStrategy | string | headers | headers or crawl_time |
| changefreq | string | weekly | Default sitemap changefreq |
| priorityRules.defaultPriority | number | 0.5 | Default sitemap priority |
| priorityRules.rules | object[] | [] | Ordered regex overrides (pattern, optional priority, optional changefreq) |
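
To illustrate the two lastmodStrategy values, here is a small sketch of how a lastmod value could be derived. The function name resolve_lastmod is hypothetical, and returning None when the Last-Modified header is missing or unparseable is an assumption; the Actor may behave differently in that case.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

W3C_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # W3C datetime in UTC, as used in sitemaps


def resolve_lastmod(headers, strategy="headers", now=None):
    """Return a lastmod string for one page, or None if it cannot be determined.

    `headers` is a plain dict of response headers (case-sensitive lookup,
    a simplification for this sketch).
    """
    if strategy == "crawl_time":
        now = now or datetime.now(timezone.utc)
        return now.astimezone(timezone.utc).strftime(W3C_FORMAT)

    raw = headers.get("Last-Modified")
    if not raw:
        return None  # assumption: omit lastmod when the header is absent
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None  # assumption: omit lastmod when the header is malformed
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime(W3C_FORMAT)
```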

Run Locally (Apify)

  1. Put your JSON input into storage/key_value_stores/default/INPUT.json.
  2. Run:

     $ apify run

Example INPUT.json:

{
  "startUrl": "https://example.com/",
  "maxDepth": 2,
  "maxPages": 500,
  "concurrency": 10,
  "allowNoindex": false,
  "includePatterns": [],
  "excludePatterns": ["/private", "/preview"],
  "outputFormats": ["html", "txt"],
  "lastmodStrategy": "headers",
  "changefreq": "weekly",
  "priorityRules": {
    "defaultPriority": 0.5,
    "rules": [
      {"pattern": "/docs/", "priority": 0.8, "changefreq": "daily"}
    ]
  }
}
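
The priorityRules block in this example could be evaluated roughly as follows. This is a sketch only: resolve_priority is a hypothetical helper, and both "first matching rule wins" and the use of re.search are assumptions about what "ordered regex overrides" means.

```python
import re


def resolve_priority(url, priority_rules, default_changefreq="weekly"):
    """Return (priority, changefreq) for a URL from ordered regex overrides."""
    priority = priority_rules.get("defaultPriority", 0.5)
    changefreq = default_changefreq
    for rule in priority_rules.get("rules", []):
        if re.search(rule["pattern"], url):
            # Each override is optional; keep the default when it is absent.
            priority = rule.get("priority", priority)
            changefreq = rule.get("changefreq", changefreq)
            break  # assumption: the first matching rule wins
    return priority, changefreq


rules = {
    "defaultPriority": 0.5,
    "rules": [{"pattern": "/docs/", "priority": 0.8, "changefreq": "daily"}],
}
docs = resolve_priority("https://example.com/docs/intro", rules)  # (0.8, "daily")
blog = resolve_priority("https://example.com/blog/", rules)       # (0.5, "weekly")
```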

Run Locally (CLI)

The module also supports direct CLI flags:

python -m src \
  --start-url https://example.com/ \
  --max-depth 2 \
  --max-pages 500 \
  --concurrency 10 \
  --allow-noindex \
  --sitemap-seed-url https://example.com/sitemap.xml \
  --exclude-pattern /private \
  --output-format html \
  --output-format txt \
  --lastmod-strategy headers \
  --changefreq weekly

Priority rules can be passed on the CLI as a JSON string or as a path to a JSON file:

$ python -m src --start-url https://example.com/ --priority-rules-json priority-rules.json

CLI flags take precedence over Actor input: CLI > INPUT JSON > defaults.
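
One way to picture this precedence is a layered merge, sketched below. The merge_config helper and the convention that an unset value is represented by None are assumptions for illustration; the default values are taken from the input table.

```python
DEFAULTS = {
    "maxDepth": 3,
    "maxPages": 1000,
    "concurrency": 10,
    "changefreq": "weekly",
}


def merge_config(cli_args, actor_input, defaults=DEFAULTS):
    """Merge settings so that CLI > INPUT JSON > defaults.

    Values of None are treated as "not provided" (an assumption of this sketch).
    """
    merged = dict(defaults)
    merged.update({k: v for k, v in actor_input.items() if v is not None})
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


cfg = merge_config(
    cli_args={"maxDepth": 2, "maxPages": None},
    actor_input={"maxDepth": 5, "maxPages": 500},
)
# maxDepth comes from the CLI, maxPages from INPUT.json, concurrency from defaults.
```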

Output Locations

  • Dataset: one item per included canonical URL (url, lastmod, changefreq, priority, depth, sourceUrl, statusCode, discoveredAt)
  • Key-value store records:
    • sitemap.xml
    • sitemap-00001.xml, sitemap-00002.xml, ...
    • sitemap.html (if enabled)
    • sitemap.txt (if enabled)
    • sitemap-summary.json
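
As an illustration of how a dataset item maps onto a sitemap entry, the sketch below renders one url element from the item fields listed above. render_url_entry is a hypothetical helper and the sample values are invented; fields such as depth, sourceUrl, statusCode, and discoveredAt are not part of the sitemap protocol, so they are left out of the XML.

```python
from xml.sax.saxutils import escape


def render_url_entry(item):
    """Render a single <url> element from a dataset item (sketch only)."""
    parts = [f"<loc>{escape(item['url'])}</loc>"]
    if item.get("lastmod"):
        parts.append(f"<lastmod>{escape(item['lastmod'])}</lastmod>")
    if item.get("changefreq"):
        parts.append(f"<changefreq>{escape(item['changefreq'])}</changefreq>")
    if item.get("priority") is not None:
        parts.append(f"<priority>{item['priority']:.1f}</priority>")
    return "<url>" + "".join(parts) + "</url>"


# Hypothetical dataset item with the documented field names.
item = {
    "url": "https://example.com/a?x=1&y=2",
    "lastmod": "2024-01-02",
    "changefreq": "weekly",
    "priority": 0.5,
    "depth": 1,
    "sourceUrl": "https://example.com/",
    "statusCode": 200,
    "discoveredAt": "2024-01-02T03:04:05Z",
}
entry = render_url_entry(item)
```

Escaping the URL matters here because query strings often contain an ampersand, which must appear as an entity inside XML.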

Run Tests

$ python -m unittest discover -s tests -p "test_*.py"

The integration fixture site is under fixtures/site/.
