
URL: https://deepwiki.com/calevans/staticforge/8.3-seo-and-discovery-features

SEO & Discovery Features | calevans/staticforge | DeepWiki


Last indexed: 11 February 2026 (5f6a2a)

SEO & Discovery Features

This document covers StaticForge's built-in features for search engine optimization and content discovery: Sitemap, RSS Feed, and RobotsTxt. These features automatically generate machine-readable files (sitemap.xml, rss.xml, robots.txt) that help search engines and feed readers discover and index your content.

For content organization features like categories and tags, see Navigation & Structure Features. For search functionality, see Interactive Features.


Overview

StaticForge includes three SEO-focused features that operate during the build pipeline:

| Feature | Output File(s) | Primary Use Case | Events |
| --- | --- | --- | --- |
| Sitemap | sitemap.xml | Search engine crawling | POST_RENDER, POST_LOOP |
| RssFeed | {category}/rss.xml | Content syndication | POST_RENDER, POST_LOOP |
| RobotsTxt | robots.txt | Crawler access control | POST_GLOB, POST_LOOP |

All three features follow a collect-then-generate pattern: they accumulate data earlier in the build (Sitemap and RssFeed during POST_RENDER, RobotsTxt during POST_GLOB), then write their output files during the POST_LOOP event.

Sources: src/Features/Sitemap/Services/SitemapService.php:1-122, src/Features/RssFeed/Services/RssFeedService.php:1-305, src/Features/RobotsTxt/Services/RobotsTxtService.php:1-214


Architecture: Event-Driven Collection Pattern


Key architectural patterns:

  1. Two-phase operation: Collection during POST_RENDER (POST_GLOB for RobotsTxt), generation during POST_LOOP
  2. Stateful services: Each service maintains internal arrays to accumulate data
  3. Independent features: No direct dependencies between the three features
  4. Shared data sources: All read from discovered_files and container variables
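
The two-phase pattern described above can be sketched in a few lines. This is a minimal illustration in Python, not StaticForge's PHP code: the class name CollectingService and the method names on_post_render and on_post_loop are hypothetical, and the "file" is written into a dictionary rather than to disk.

```python
from dataclasses import dataclass, field


@dataclass
class CollectingService:
    """Toy stand-in for a stateful feature service (hypothetical, not StaticForge code)."""

    urls: list = field(default_factory=list)
    written: dict = field(default_factory=dict)

    def on_post_render(self, url: str) -> None:
        # Phase 1: called once per rendered file; only records data.
        self.urls.append(url)

    def on_post_loop(self) -> None:
        # Phase 2: called once after all files are processed; produces the output.
        self.written["sitemap.xml"] = "\n".join(self.urls)


service = CollectingService()
for page in ["https://example.com/a.html", "https://example.com/b.html"]:
    service.on_post_render(page)
service.on_post_loop()
```

Keeping the accumulated state inside the service is what lets the final handler see every page at once, which a per-file handler alone could not do.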

Sources: src/Features/Sitemap/Services/SitemapService.php:26-121, src/Features/RssFeed/Services/RssFeedService.php:30-145, src/Features/RobotsTxt/Services/RobotsTxtService.php:28-102


Sitemap Feature

Purpose

Generates sitemap.xml containing all rendered pages with URLs and modification dates. This file helps search engines discover and crawl your content efficiently.

Event Registration


Sources: src/Features/Sitemap/Services/SitemapService.php:31-77, src/Features/Sitemap/Services/SitemapService.php:86-121

Data Collection Flow

During POST_RENDER, the service collects URL data:


URL construction logic:

  1. Remove OUTPUT_DIR prefix from output_path: public/foo/bar.html → foo/bar.html
  2. Prepend SITE_BASE_URL: https://example.com/ + foo/bar.html
  3. Result: https://example.com/foo/bar.html
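
The three steps above amount to a prefix strip followed by a join. A minimal Python sketch, assuming OUTPUT_DIR is "public" and using a hypothetical helper name build_page_url (the real service is PHP and may handle edge cases differently):

```python
def build_page_url(output_path: str, output_dir: str, site_base_url: str) -> str:
    """Turn a rendered file path into an absolute URL (illustrative helper)."""
    # Step 1: drop the output directory prefix, e.g. "public/".
    prefix = output_dir.rstrip("/") + "/"
    if output_path.startswith(prefix):
        relative = output_path[len(prefix):]
    else:
        relative = output_path
    # Steps 2-3: join with the base URL, avoiding a doubled slash.
    return site_base_url.rstrip("/") + "/" + relative.lstrip("/")


print(build_page_url("public/foo/bar.html", "public", "https://example.com/"))
# prints https://example.com/foo/bar.html
```

Normalizing the slash on both sides means the result is the same whether or not SITE_BASE_URL ends with "/".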

Sources: src/Features/Sitemap/Services/SitemapService.php:31-77

XML Generation

During POST_LOOP, generates XML conforming to sitemaps.org schema:
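
For orientation, a sitemap in the sitemaps.org format is a urlset element containing one url entry per page, each with a loc and an optional lastmod. The Python sketch below builds such a document with the standard library; the function name build_sitemap and the (url, date) input shape are assumptions for illustration, not the structure SitemapService uses internally.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # W3C date, e.g. 2024-01-15
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap([("https://example.com/foo/bar.html", "2024-01-15")])
```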


Implementation: src/Features/Sitemap/Services/SitemapService.php:86-121

Output location: {OUTPUT_DIR}/sitemap.xml

Configuration

No configuration required. The feature automatically includes all rendered pages.

Required container variables:

  • OUTPUT_DIR - Where to write sitemap.xml
  • SITE_BASE_URL - Base URL for constructing absolute URLs

Sources: src/Features/Sitemap/Services/SitemapService.php:43-50, src/Features/Sitemap/Services/SitemapService.php:108-112


RSS Feed Feature

Purpose

Generates category-specific RSS 2.0 feeds for content syndication. Each category with files gets its own RSS feed at /{category-slug}/rss.xml.

Event Registration & Extension Points


Extension mechanism: The RSS feature fires its own events (RSS_BUILDER_INIT, RSS_ITEM_BUILDING) to allow other features (like Podcast) to augment feed metadata.

Sources: src/Features/RssFeed/Services/RssFeedService.php:30-88, src/Features/RssFeed/Services/RssFeedService.php:96-145, src/Features/RssFeed/Services/RssFeedService.php:179-218

Data Collection by Category


Category file organization:
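
One plausible shape for this grouping is a map from category slug to the files that declare that category. The Python sketch below is an assumption about the general idea only: the file dictionaries, the slugify rule, and the function names are hypothetical and are not taken from RssFeedService.

```python
import re
from collections import defaultdict


def slugify(category: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with '-' (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "-", category.lower()).strip("-")


def group_by_category(files):
    """Map each category slug to its files; files without a category are skipped."""
    grouped = defaultdict(list)
    for file in files:
        category = file.get("metadata", {}).get("category")
        if category:
            grouped[slugify(category)].append(file)
    return dict(grouped)


files = [
    {"output_path": "public/tech/article1.html", "metadata": {"category": "Tech"}},
    {"output_path": "public/blog/post1.html", "metadata": {"category": "Blog"}},
    {"output_path": "public/index.html", "metadata": {}},
]
# One feed path per category that has at least one file.
feeds = {slug: f"{slug}/rss.xml" for slug in group_by_category(files)}
```

Here feeds holds one entry each for tech and blog, and index.html contributes to no feed because it has no category.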


Sources: src/Features/RssFeed/Services/RssFeedService.php:36-88

Description Extraction Logic

The service extracts descriptions using a priority fallback:

  1. Explicit metadata: metadata['description'] if present
  2. Auto-extract: First 200 characters of stripped HTML content
    • Strip all HTML tags
    • Collapse whitespace
    • Truncate at word boundary (last space before 200 chars)
    • Append ...
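
The fallback above can be sketched as follows. This is a Python illustration rather than the PHP implementation; the function name extract_description is hypothetical, and two details are assumptions: tags are replaced by a space so adjacent words do not fuse, and text that already fits within the limit is returned without the trailing "...".

```python
import re


def extract_description(metadata: dict, html: str, limit: int = 200) -> str:
    """Return the explicit description, or a truncated plain-text excerpt."""
    # Priority 1: explicit metadata wins.
    if metadata.get("description"):
        return metadata["description"]
    # Priority 2: strip tags, then collapse whitespace.
    text = re.sub(r"<[^>]+>", " ", html)
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) <= limit:
        return text  # assumption: short text is returned as-is
    # Truncate at the last space inside the first `limit` characters.
    cut = text[:limit]
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]
    return cut + "..."


short = extract_description({}, "<p>Hello   <b>world</b></p>")
long_excerpt = extract_description({}, "<p>" + "word " * 100 + "</p>")
```

Cutting at the last space keeps the excerpt from ending in the middle of a word.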

Implementation: src/Features/RssFeed/Services/RssFeedService.php:248-274

RSS Feed Generation


Output structure:

public/
├── tech/
│ └── rss.xml # RSS feed for 'Tech' category
├── blog/
│ └── rss.xml # RSS feed for 'Blog' category
└── sitemap.xml

Sources: src/Features/RssFeed/Services/RssFeedService.php:96-145, src/Features/RssFeed/Services/RssFeedService.php:156-233

RSS 2.0 XML Structure
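
As a reminder of the general RSS 2.0 layout, a feed is an rss element wrapping a single channel, which carries title, link, and description followed by item entries. The Python sketch below builds that skeleton with the standard library. The function name build_rss and the input dictionaries are hypothetical, and the exact set of elements StaticForge emits (via RssBuilder, FeedChannel, and FeedItem) is not shown here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(channel: dict, items: list) -> str:
    """Build a minimal RSS 2.0 document (illustrative element set)."""
    rss = ET.Element("rss", version="2.0")
    chan = ET.SubElement(rss, "channel")
    for tag in ("title", "link", "description"):
        ET.SubElement(chan, tag).text = channel[tag]
    for entry in items:
        item = ET.SubElement(chan, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "guid").text = entry["link"]
        # RSS 2.0 dates use the RFC 822 style, e.g. "Mon, 15 Jan 2024 10:30:00 +0000".
        ET.SubElement(item, "pubDate").text = format_datetime(entry["date"])
        ET.SubElement(item, "description").text = entry["description"]
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(rss, encoding="unicode")


feed = build_rss(
    {"title": "Tech", "link": "https://example.com/tech/", "description": "Tech posts"},
    [{
        "title": "Article 1",
        "link": "https://example.com/tech/article1.html",
        "date": datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
        "description": "First article.",
    }],
)
```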


Sources: src/Features/RssFeed/Services/RssFeedService.php:185-221

Category Definitions and Metadata

RSS generation integrates with category definition files:


The service reads these definitions to populate feed metadata:

Implementation: src/Features/RssFeed/Services/RssFeedService.php:119-141

Configuration

No explicit configuration required. RSS feeds are automatically generated for any category with content.

Required container variables:

  • OUTPUT_DIR - Where to write RSS files
  • SITE_BASE_URL - Base URL for feed links
  • site_config['site']['name'] - Site name for feed title

Sources: src/Features/RssFeed/Services/RssFeedService.php:106-118


RobotsTxt Feature

Purpose

Generates robots.txt to control search engine crawler access. Supports both individual page exclusions (robots: no in frontmatter) and entire category exclusions.

Event Registration


Sources: src/Features/RobotsTxt/Services/RobotsTxtService.php:28-54, src/Features/RobotsTxt/Services/RobotsTxtService.php:56-102

Metadata Scanning (POST_GLOB)

The service scans discovered_files during POST_GLOB to build a list of disallowed paths:


Path calculation logic:

  1. Remove SOURCE_DIR prefix from file path
  2. Convert file extension to .html: .md → .html
  3. Prepend /: foo/bar.html → /foo/bar.html

Category path calculation:

  1. Extract category slug from filename or metadata
  2. Sanitize: lowercase + replace non-alphanumeric with -
  3. Format as directory: /{category-slug}/
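
Both path calculations above can be sketched in Python as shown below. The function names are hypothetical, SOURCE_DIR is assumed to be "content" in the example, and collapsing runs of non-alphanumeric characters into a single "-" is an assumption; the PHP service may sanitize character by character.

```python
import re
from pathlib import PurePosixPath


def page_disallow_path(file_path: str, source_dir: str) -> str:
    """Source file path -> robots.txt path for the rendered page."""
    # Step 1: remove the source directory prefix.
    relative = PurePosixPath(file_path).relative_to(source_dir)
    # Steps 2-3: swap the extension for .html and prepend "/".
    return "/" + str(relative.with_suffix(".html"))


def category_disallow_path(category: str) -> str:
    """Category name -> robots.txt directory path."""
    slug = re.sub(r"[^a-z0-9]+", "-", category.lower()).strip("-")
    return f"/{slug}/"


print(page_disallow_path("content/foo/bar.md", "content"))
# prints /foo/bar.html
print(category_disallow_path("Internal Documentation"))
# prints /internal-documentation/
```

The trailing slash on the category path is what makes the Disallow rule cover every page under that directory.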

Sources: src/Features/RobotsTxt/Services/RobotsTxtService.php:104-160, src/Features/RobotsTxt/Services/RobotsTxtService.php:162-187

Frontmatter Control


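Setting robots: no in a page's frontmatter, for example in a source file named private-document.md, results in this robots.txt entry: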

Disallow: /private-document.html

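Category-wide exclusion: set robots: no in a category definition file (one with type: category). For a category whose slug is internal-documentation, this results in: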

Disallow: /internal-documentation/

This excludes all pages in that category directory.

Sources: src/Features/RobotsTxt/Services/RobotsTxtService.php:108-126, src/Features/RobotsTxt/Services/RobotsTxtService.php:130-160

robots.txt Generation


Output format:

# robots.txt generated by StaticForge
# 2024-01-15 10:30:00

User-agent: *
Disallow: /admin/secret.html
Disallow: /internal-docs/
Disallow: /private.html

# Sitemap location
Sitemap: https://example.com/sitemap.xml
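
A file in this format can be assembled from the collected paths with a few string operations. The Python sketch below reproduces the layout of the sample output; the function name build_robots_txt is hypothetical, and sorting and de-duplicating the paths is an assumption inferred from the alphabetical order of the sample, not confirmed behavior of RobotsTxtGenerator.

```python
from datetime import datetime


def build_robots_txt(disallowed, site_base_url: str, now: datetime) -> str:
    """Render robots.txt text from disallowed paths (illustrative layout)."""
    lines = [
        "# robots.txt generated by StaticForge",
        f"# {now:%Y-%m-%d %H:%M:%S}",
        "",
        "User-agent: *",
    ]
    # Assumption: paths are de-duplicated and sorted before output.
    lines += [f"Disallow: {path}" for path in sorted(set(disallowed))]
    sitemap_url = site_base_url.rstrip("/") + "/sitemap.xml"
    lines += ["", "# Sitemap location", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    ["/private.html", "/admin/secret.html", "/internal-docs/", "/private.html"],
    "https://example.com/",
    datetime(2024, 1, 15, 10, 30),
)
```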

Sources: src/Features/RobotsTxt/Services/RobotsTxtGenerator.php:9-43

Case Insensitivity

The robots field value is case-insensitive:

  • robots: no
  • robots: NO
  • robots: No

All are treated identically.
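
A case-insensitive check of this kind usually comes down to lowercasing the value before comparing. A minimal Python sketch, with a hypothetical function name and the added assumption that surrounding whitespace is ignored:

```python
def is_excluded(metadata: dict) -> bool:
    """True when the robots field is 'no' in any letter case."""
    value = metadata.get("robots")
    return isinstance(value, str) and value.strip().lower() == "no"
```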

Implementation: src/Features/RobotsTxt/Services/RobotsTxtService.php:114-117

Configuration

No configuration required. The feature automatically processes all files with robots metadata.

Required container variables:

  • OUTPUT_DIR - Where to write robots.txt
  • SITE_BASE_URL - For sitemap reference in robots.txt
  • SOURCE_DIR - For path calculations

Sources: src/Features/RobotsTxt/Services/RobotsTxtService.php:71-79


Integration: File Output Structure

The three features collectively produce these files:

public/
├── sitemap.xml # Sitemap Feature
├── robots.txt # RobotsTxt Feature
├── tech/
│ ├── rss.xml # RSS Feature (Tech category)
│ ├── article1.html
│ └── article2.html
├── blog/
│ ├── rss.xml # RSS Feature (Blog category)
│ ├── post1.html
│ └── post2.html
└── index.html

Sitemap reference in robots.txt:

Sitemap: https://example.com/sitemap.xml

RSS feed discovery: RSS feed URLs follow the pattern /{category-slug}/rss.xml and can be discovered via:

  • <link rel="alternate" type="application/rss+xml"> in HTML <head>
  • Direct URL construction from category name
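
Both discovery routes rely on the same URL pattern. The Python sketch below derives the feed URL from a category name and wraps it in the link element a template could place in the HTML head. The function name rss_link_tag and the slug rule are assumptions; StaticForge's templates may emit this tag differently or not at all.

```python
import re
from html import escape


def rss_link_tag(category: str, site_base_url: str) -> str:
    """Build an RSS autodiscovery <link> tag for a category (illustrative)."""
    slug = re.sub(r"[^a-z0-9]+", "-", category.lower()).strip("-")
    href = f"{site_base_url.rstrip('/')}/{slug}/rss.xml"
    return (
        '<link rel="alternate" type="application/rss+xml" '
        f'title="{escape(category)}" href="{escape(href)}">'
    )


print(rss_link_tag("Tech", "https://example.com"))
# prints <link rel="alternate" type="application/rss+xml" title="Tech" href="https://example.com/tech/rss.xml">
```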

Sources: src/Features/Sitemap/Services/SitemapService.php:112-113, src/Features/RssFeed/Services/RssFeedService.php:224-230, src/Features/RobotsTxt/Services/RobotsTxtGenerator.php:34-39


Code Entity Reference

Key Classes and Services

| Class | Path | Responsibility |
| --- | --- | --- |
| SitemapService | src/Features/Sitemap/Services/SitemapService.php | URL collection and sitemap generation |
| RssFeedService | src/Features/RssFeed/Services/RssFeedService.php | Category file collection and RSS feed generation |
| RobotsTxtService | src/Features/RobotsTxt/Services/RobotsTxtService.php | Robots metadata scanning and robots.txt generation |
| RobotsTxtGenerator | src/Features/RobotsTxt/Services/RobotsTxtGenerator.php | robots.txt formatting logic |
| RssBuilder | src/Features/RssFeed/Services/RssBuilder.php | RSS XML construction |
| FeedChannel | src/Features/RssFeed/Models/FeedChannel.php | RSS channel data model |
| FeedItem | src/Features/RssFeed/Models/FeedItem.php | RSS item data model |

Event Handlers


Sources: src/Features/Sitemap/Services/SitemapService.php:31-121, src/Features/RssFeed/Services/RssFeedService.php:30-233, src/Features/RobotsTxt/Services/RobotsTxtService.php:28-102


Common Metadata Fields

These frontmatter fields affect SEO features:

| Field | Type | Used By | Effect |
| --- | --- | --- | --- |
| robots | string | RobotsTxt | no = add a Disallow entry for the page (or category) to robots.txt |
| category | string | RssFeed | Determines RSS feed file grouping |
| date | string/timestamp | Sitemap, RssFeed | Publication date for sitemap/RSS |
| description | string | RssFeed | RSS item description (overrides auto-extract) |
| type | string | RobotsTxt | category enables category-wide exclusion |
| published_date | string | RssFeed | Preferred date field for RSS (overrides date) |

Example combining multiple SEO fields:
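
As a stand-in for a page's frontmatter, the Python dictionary below combines the fields from the table, and two small helpers show the precedence rules the table states. All values are invented for illustration, and the helper names are hypothetical.

```python
from typing import Optional

# Stand-in for parsed frontmatter; every value here is made up.
metadata = {
    "title": "Internal Launch Notes",
    "category": "Tech",                 # groups the page into the tech feed
    "description": "Summary shown in the RSS item.",
    "date": "2024-01-10",
    "published_date": "2024-01-15",     # preferred over date for RSS
    "robots": "no",                     # adds a Disallow entry for this page
}


def pick_rss_date(meta: dict) -> Optional[str]:
    """published_date wins over date when both are present."""
    return meta.get("published_date") or meta.get("date")


def pick_rss_description(meta: dict, auto_extracted: str) -> str:
    """An explicit description overrides the auto-extracted excerpt."""
    return meta.get("description") or auto_extracted
```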


Sources: src/Features/Sitemap/Services/SitemapService.php:55-70, src/Features/RssFeed/Services/RssFeedService.php:276-294, src/Features/RobotsTxt/Services/RobotsTxtService.php:114-117


Testing

Unit Tests

The services are covered by unit tests; the Sitemap and RobotsTxt test suites are listed under Sources below.

Sources: tests/Unit/Features/Sitemap/SitemapServiceTest.php:1-105, tests/Unit/Features/RobotsTxt/RobotsTxtServiceTest.php:1-125


Extension Points

RSS Events

The RSS feature fires custom events to support extensions:

RSS_BUILDER_INIT

  • Fired once per feed before building
  • Parameters: builder (RssBuilder), category_metadata (array)
  • Use case: Configure podcast extensions, add custom namespaces

RSS_ITEM_BUILDING

  • Fired once per item during feed construction
  • Parameters: item (FeedItem), file (array)
  • Use case: Add enclosures, media metadata, custom fields
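
The general mechanism, where a feed builder fires an event and listeners adjust the item before it is written, can be sketched as follows. This is a generic Python illustration and not StaticForge's event manager API: register, fire, the dictionary-based item, and the audio_url field are all hypothetical. Only the event name RSS_ITEM_BUILDING and the item and file parameters come from the description above.

```python
from collections import defaultdict

# Event name -> list of listener callables (toy dispatcher).
listeners = defaultdict(list)


def register(event: str, listener) -> None:
    listeners[event].append(listener)


def fire(event: str, params: dict) -> dict:
    # Each listener receives the parameters and returns them, possibly modified.
    for listener in listeners[event]:
        params = listener(params)
    return params


def add_enclosure(params: dict) -> dict:
    """Hypothetical podcast-style listener: attach an audio enclosure."""
    audio = params["file"].get("metadata", {}).get("audio_url")  # invented field
    if audio:
        params["item"]["enclosure"] = {"url": audio, "type": "audio/mpeg"}
    return params


register("RSS_ITEM_BUILDING", add_enclosure)
result = fire(
    "RSS_ITEM_BUILDING",
    {
        "item": {"title": "Episode 1"},
        "file": {"metadata": {"audio_url": "https://example.com/ep1.mp3"}},
    },
)
```

Because the RSS feature only fires the event, it needs no knowledge of which listeners exist, which is what keeps an extension such as Podcast decoupled from feed generation.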

Example usage (Podcast feature):


Sources: src/Features/RssFeed/Services/RssFeedService.php:179-183, src/Features/RssFeed/Services/RssFeedService.php:211-215


Troubleshooting

Sitemap Issues

Empty sitemap.xml:

  • Check SITE_BASE_URL is set in .env
  • Verify files are actually rendering (check POST_RENDER events)

Wrong URLs in sitemap:

  • Verify SITE_BASE_URL matches your deployment domain
  • Check category configurations aren't creating unexpected paths

RSS Issues

No RSS files generated:

  • Ensure content files have category metadata
  • Check at least one file per category exists
  • Verify OUTPUT_DIR has write permissions

Missing descriptions:

  • Add explicit description field to frontmatter
  • Ensure content has text (not just images) for auto-extraction

RobotsTxt Issues

Paths not excluded:

  • Verify the robots value is no; matching is case-insensitive, though lowercase is the convention
  • Check frontmatter YAML syntax is valid
  • Ensure POST_GLOB event is firing before rendering

Category not excluded:

  • Verify category definition file has type: category
  • Check robots: no is in category definition file, not content files

Sources: src/Features/Sitemap/Services/SitemapService.php:43-50, src/Features/RssFeed/Services/RssFeedService.php:98-104, src/Features/RobotsTxt/Services/RobotsTxtService.php:114-125