URL: https://apify.com/jacksu/public-ai-crawler-policy-signal-agent

AI Crawler Policy Checker For robots.txt And llms.txt · Apify

Public AI Crawler Policy Signal Agent

Pricing: Pay per event

Analyze public robots.txt and llms.txt files for AI crawler allow/block policy evidence, LLM guidance files, stable hashes, and useful-result pricing.

Rating: 0.0 (0)

Developer: jack su

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 10 days ago

Analyze public site-level policy files for AI crawler and LLM-agent guidance: explicit AI crawler rules in robots.txt, public llms.txt / llms-full.txt files, stable policy hashes, diagnostics, and change-aware useful billing.

What It Reads

  • One public site origin, robots.txt, llms.txt, or llms-full.txt URL per input.
  • robots.txt user-agent blocks for known AI crawler tokens such as GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, and similar public bot names.
  • Public llms.txt and optionally llms-full.txt files for headings, links, topics, preview text, and evidence URLs.
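The robots.txt reading step described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the function name `extract_ai_rules` and the constant `AI_BOT_TOKENS` are hypothetical, and the token list contains only the names mentioned in this listing.

```python
from typing import Dict, List

# Tokens named in the listing; the actor's real list is probably longer (assumption).
AI_BOT_TOKENS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
                 "PerplexityBot", "Google-Extended", "Applebot-Extended", "CCBot"]


def extract_ai_rules(robots_txt: str) -> Dict[str, List[str]]:
    """Map each known AI bot token to the allow/disallow lines of its group."""
    known = {token.lower(): token for token in AI_BOT_TOKENS}
    rules: Dict[str, List[str]] = {}
    current: List[str] = []  # user-agents of the group being read
    in_rules = False         # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in current:
                if agent in known:
                    rules.setdefault(known[agent], []).append(f"{field}: {value}")
    return rules


sample = "User-agent: GPTBot\nUser-agent: ClaudeBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
print(extract_ai_rules(sample))
```

A file that contains only a wildcard group yields an empty mapping here, which corresponds to the "generic robots.txt" case this listing treats as a partial record.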

What It Does Not Do

  • It does not crawl pages, parse sitemap URLs, audit SEO metadata, run a browser, execute JavaScript, log in, fetch private pages, or inspect account areas.
  • It rejects private-network hosts, query strings, fragments, credentials, path parameters, sensitive account paths, and token-like path segments.
  • It does not decide legal permission. It only returns public policy evidence that a human or downstream agent can review.
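The input rejection rules above might look something like the sketch below. The function name `is_acceptable_input` and the path allow-list are assumptions made for illustration; the actor's real checks (for example, its handling of sensitive account paths and token-like segments) are not published in this listing, so the sketch uses a stricter rule that only accepts the origin and the three policy file paths.

```python
import ipaddress
from urllib.parse import urlsplit

# Only the origin and the policy files named in the listing are accepted (assumption).
ALLOWED_PATHS = {"", "/", "/robots.txt", "/llms.txt", "/llms-full.txt"}


def is_acceptable_input(url: str) -> bool:
    """Return True if the URL looks like a public origin or policy-file URL."""
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. malformed IPv6 literal
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    if parts.username or parts.password:  # embedded credentials
        return False
    if parts.query or parts.fragment:
        return False
    if parts.path not in ALLOWED_PATHS:  # also excludes path parameters and account paths
        return False
    host = parts.hostname
    if host == "localhost" or host.endswith(".local"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name; resolving it is outside the scope of this sketch
    return ip.is_global  # rejects private, loopback, and link-local addresses


print(is_acceptable_input("https://openai.com/"))
print(is_acceptable_input("http://10.0.0.1/robots.txt"))
```

A production check would also need to resolve DNS names and re-validate the resolved address, since a public hostname can point at a private network.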

Pricing Events

  • apify-actor-start: a single small run-start event, charged only when configured in Apify.
  • useful-ai-crawler-policy-result: charged only for useful, new or changed AI crawler policy evidence.

Generic robots.txt files without AI-specific user-agent blocks and without llms.txt guidance are written as partial records and do not trigger the useful event. Unchanged hashes, invalid inputs, failed fetches, and missing policy evidence are also uncharged.

apify-default-dataset-item is intentionally not used.
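The change-aware billing rule described above could be approximated like this. It is a sketch under stated assumptions: the helper names `policy_hash` and `billable_event`, the evidence keys `aiRules` and `llmsTxt`, and the choice of SHA-256 over canonical JSON are all hypothetical; only the event name comes from this listing.

```python
import hashlib
import json
from typing import Optional


def policy_hash(evidence: dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def billable_event(evidence: dict, previous_hash: Optional[str]) -> Optional[str]:
    """Return the event name to charge, or None when the result is not billable."""
    has_ai_evidence = bool(evidence.get("aiRules") or evidence.get("llmsTxt"))
    if not has_ai_evidence:
        return None  # generic robots.txt, no llms.txt: partial record, uncharged
    if policy_hash(evidence) == previous_hash:
        return None  # unchanged since the last run: uncharged
    return "useful-ai-crawler-policy-result"


evidence = {"aiRules": {"GPTBot": ["disallow: /"]}, "llmsTxt": None}
first = billable_event(evidence, None)                    # new evidence
second = billable_event(evidence, policy_hash(evidence))  # same evidence again
print(first, second)
```

Hashing a sorted, compact serialization is one common way to get a stable fingerprint; any deterministic canonical form would serve the same purpose.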

Example Input

{
  "siteUrls": ["https://openai.com/"],
  "includeLlmsFullTxt": true,
  "requestTimeoutSecs": 15
}

Output Highlights

  • policyType
  • aiCrawlerPolicySignals
  • knownAiProviders
  • knownAiBotUserAgents
  • wildcardRobotsPolicy
  • llmsTxtSignals
  • riskLabels
  • diagnostics
  • aiCrawlerPolicyHash
  • changeStatus
  • billableEventName
