
URL: https://apify.com/taroyamada/site-qa-indexability-ai-crawler-report-scraper

Site QA Indexability AI Crawler Report Scraper · Apify



Pricing: from $30.00 / 1,000 AI crawler policy checks

Unofficially audit user-supplied public pages, robots.txt, and llms.txt signals for AI crawler indexability issues and source-linked report rows.


Rating: 0.0 (0)

Developer: naoki anzai (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: a month ago

Site owners, SEO agencies, and content teams use this actor to audit public indexability and AI crawler access signals. Provide public URLs and optional AI crawler user-agent names. The actor returns source-linked policy observations, indexability issues, reports, and export rows.

Store Quickstart

Run with dryRun=false and public URLs that you own or are allowed to audit.

{
  "urls": ["https://example.com/?siteQaCanary=indexability-ai-crawler-v1"],
  "aiCrawlerUserAgents": ["GPTBot", "Google-Extended", "PerplexityBot", "ClaudeBot"],
  "checkRobotsTxt": true,
  "checkLlmsTxt": true,
  "authorizedUseConfirmed": true,
  "generateReport": true,
  "emitUnchanged": false,
  "dryRun": false
}
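The quickstart input can also be assembled programmatically. The following is a minimal Python sketch, not part of the actor: `build_run_input` is a hypothetical helper that mirrors the field names shown above and enforces, on the client side, the guardrail that non-dry runs require authorizedUseConfirmed=true. How the actor itself validates input is not documented here, so treat the checks as assumptions.

```python
from typing import Any


def build_run_input(
    urls: list[str],
    *,
    dry_run: bool = False,
    authorized_use_confirmed: bool = False,
    **options: Any,
) -> dict[str, Any]:
    """Assemble an input object shaped like the quickstart example.

    Hypothetical helper: the field names come from the listing, the
    validation rules are a client-side assumption.
    """
    if not urls:
        raise ValueError("at least one public URL is required")
    for url in urls:
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an http(s) URL: {url!r}")
    # Listing guardrail: non-dry runs require authorizedUseConfirmed=true.
    if not dry_run and not authorized_use_confirmed:
        raise ValueError("non-dry runs require authorizedUseConfirmed=true")

    run_input: dict[str, Any] = {
        "urls": list(urls),
        "authorizedUseConfirmed": authorized_use_confirmed,
        "dryRun": dry_run,
    }
    # Pass-through for optional fields such as checkRobotsTxt or generateReport.
    run_input.update(options)
    return run_input


quickstart = build_run_input(
    ["https://example.com/?siteQaCanary=indexability-ai-crawler-v1"],
    authorized_use_confirmed=True,
    aiCrawlerUserAgents=["GPTBot", "Google-Extended", "PerplexityBot", "ClaudeBot"],
    checkRobotsTxt=True,
    checkLlmsTxt=True,
    generateReport=True,
    emitUnchanged=False,
)
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the actor's input form, or passed as run input by whichever client you use to start the run.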

Input Examples

Audit one page and origin policies

{
  "urls": ["https://example.com/blog/launch"],
  "aiCrawlerUserAgents": ["GPTBot", "ClaudeBot"],
  "checkRobotsTxt": true,
  "checkLlmsTxt": true,
  "authorizedUseConfirmed": true,
  "dryRun": false
}

Batch audit a site section

{
  "urls": [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/docs"
  ],
  "maxPages": 25,
  "emitPageRows": false,
  "generateReport": true,
  "authorizedUseConfirmed": true,
  "dryRun": false
}

Generate a handoff export

{
  "urls": ["https://example.com/landing-page"],
  "aiCrawlerUserAgents": ["GPTBot", "Google-Extended", "PerplexityBot"],
  "emitExport": true,
  "emitUnchanged": false,
  "authorizedUseConfirmed": true,
  "dryRun": false
}

Sample Output

{
  "actorName": "site-qa-indexability-ai-crawler-report-scraper",
  "rowType": "indexability_issue",
  "billingEventName": "indexability-issue-detected",
  "issueType": "ai_crawler_disallowed_by_robots",
  "severity": "high",
  "sourceUrl": "https://example.com/?siteQaCanary=indexability-ai-crawler-v1"
}
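To make the ai_crawler_disallowed_by_robots issue type concrete, here is a small Python sketch of the kind of check it describes. It is not the actor's implementation: it uses the standard library's `urllib.robotparser`, which does simple prefix matching and may differ from how the actor or any real crawler interprets a robots.txt file. The robots.txt content and the `blocked_user_agents` helper are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: GPTBot is blocked site-wide, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_user_agents(robots_txt: str, url: str, user_agents: list[str]) -> list[str]:
    """Return the user agents that this robots.txt disallows for the given URL."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [ua for ua in user_agents if not parser.can_fetch(ua, url)]


blocked = blocked_user_agents(
    ROBOTS_TXT,
    "https://example.com/blog/launch",
    ["GPTBot", "ClaudeBot"],
)
print(blocked)  # ['GPTBot']
```

A result like this corresponds to the situation the sample row reports: a named AI crawler is disallowed for a URL that other agents may fetch.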

Output Fields

  • rowType: ai_crawler_policy_observation, indexability_issue, ai_crawler_indexability_report, or indexability_export.
  • billingEventName: PAY_PER_EVENT event name used for the row.
  • sourceUrl: public URL or policy file that supports the row.
  • issueType: detected source-linked issue when applicable.
  • blockedUserAgents: crawler names with broad robots.txt blocks when detected.
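Once dataset rows are downloaded, the fields above are enough to build a quick summary. The sketch below assumes the rows are available as a list of Python dictionaries; the example rows are hypothetical and modeled on the sample output, and real rows may carry additional fields.

```python
from collections import Counter, defaultdict

# Hypothetical rows modeled on the sample output; not real actor results.
rows = [
    {
        "rowType": "ai_crawler_policy_observation",
        "billingEventName": "ai-crawler-policy-checked",
        "sourceUrl": "https://example.com/robots.txt",
    },
    {
        "rowType": "indexability_issue",
        "billingEventName": "indexability-issue-detected",
        "issueType": "ai_crawler_disallowed_by_robots",
        "severity": "high",
        "sourceUrl": "https://example.com/pricing",
    },
    {
        "rowType": "indexability_issue",
        "billingEventName": "indexability-issue-detected",
        "issueType": "ai_crawler_disallowed_by_robots",
        "severity": "high",
        "sourceUrl": "https://example.com/docs",
    },
]


def count_by_row_type(rows: list[dict]) -> Counter:
    """Count rows per rowType value."""
    return Counter(row.get("rowType", "unknown") for row in rows)


def issues_by_source(rows: list[dict]) -> dict[str, list[str]]:
    """Group issueType values by the sourceUrl that supports them."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        if row.get("rowType") == "indexability_issue":
            grouped[row.get("sourceUrl", "")].append(row.get("issueType", "unknown"))
    return dict(grouped)


row_counts = count_by_row_type(rows)
issue_map = issues_by_source(rows)
```

Grouping by sourceUrl keeps each finding tied to the page or policy file that supports it, which is how the rows themselves are described.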

Pricing And No-Change Runs

  • ai-crawler-policy-checked: $0.030 per public robots.txt or llms.txt policy observation.
  • indexability-issue-detected: $0.120 per source-linked indexability issue.
  • ai-crawler-indexability-report: $6.000 per site-level report.
  • indexability-export-generated: $8.000 per generated export.

When emitUnchanged=false, a repeated run that finds nothing changed since the saved state emits zero dataset rows and incurs zero charges.
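The per-event prices above make a run's event charges straightforward to estimate. The following Python sketch multiplies event counts by the listed prices; the `estimate_cost` helper and the example counts are illustrative assumptions, and the events actually charged depend on what a given run emits.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the list above.
PRICES_USD = {
    "ai-crawler-policy-checked": Decimal("0.030"),
    "indexability-issue-detected": Decimal("0.120"),
    "ai-crawler-indexability-report": Decimal("6.000"),
    "indexability-export-generated": Decimal("8.000"),
}


def estimate_cost(event_counts: dict[str, int]) -> Decimal:
    """Sum price * count over the billed events of one run."""
    total = Decimal("0")
    for name, count in event_counts.items():
        if name not in PRICES_USD:
            raise KeyError(f"unknown billing event: {name}")
        if count < 0:
            raise ValueError(f"negative count for {name}")
        total += PRICES_USD[name] * count
    return total


# Illustrative run: 2 policy checks, 3 issues, 1 report, no export.
# 2 * 0.030 + 3 * 0.120 + 1 * 6.000 = 6.42
example_cost = estimate_cost({
    "ai-crawler-policy-checked": 2,
    "indexability-issue-detected": 3,
    "ai-crawler-indexability-report": 1,
})
```

Using `Decimal` rather than floats keeps the arithmetic exact at the three-decimal precision the price list uses. An unchanged run with emitUnchanged=false corresponds to an empty `event_counts` dictionary and therefore a total of zero.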

Compliance Guardrails

  • Public pages, robots.txt, and llms.txt only.
  • No login, paywall, CAPTCHA, private session, credentialed API, or bypass behavior.
  • Non-dry runs require authorizedUseConfirmed=true; use this only for sites you own, manage, or are allowed to audit.
  • This is an unofficial audit tool and is not affiliated with any crawler, search engine, or AI provider.
  • No ranking guarantee, AI citation guarantee, legal conclusion, or compliance certification.


You might also like

Indexability Audit

zerobreak/indexability-audit

Indexability audit tool that checks robots.txt, meta robots tags, X-Robots-Tag headers, and canonical URLs for any list of pages, so SEO teams know which ones Google can actually crawl and index.

Ai Visibility Suite - Dark Visitors Alternative

alizarin_refrigerator-owner/ai-visibility-suite---dark-visitors-alternative

Comprehensive AI bot monitoring, robots.txt analysis, LLMs.txt generation & AI shopping optimization. Monitor AI crawler visits, check AI compliance, generate AI-friendly configurations, and optimize for AI shopping agents. AI Bot Directory Robots.txt LLMs.txt AI Shopping Competitor AI Audit

GEO Site Audit - AI Readiness Checker

dltik/geo-site-audit

Audit your website for AI crawler accessibility: robots.txt (GPTBot, ClaudeBot, Perplexity), llms.txt, sitemap, Schema.org, meta tags, content extractability, TTFB. Get an AI-readiness score 0-100 with prioritized recommendations.

AI Readiness Auditor

rationalistic_counsel/ai-readiness-auditor

Check how AI-ready any website is. Get an AI Readiness Score (0-100) checking llms.txt, robots.txt AI crawler directives, Schema.org structured data, and meta tags. No API key needed.

AI Readiness Checker - Website Scanner

alizarin_refrigerator-owner/ai-readiness-checker

Analyze any website for AI optimization readiness. Check robots.txt, llms.txt, structured data, meta tags & content quality. Get actionable recommendations to improve AI crawler accessibility.

Robots Txt Audit

apage/robots-txt-audit

Audit robots.txt files for AI crawler access. Get an AI Readiness Score (0-100), analyze 61+ AI crawlers (ChatGPT, Claude, Perplexity, Gemini), detect syntax errors, security concerns, and get actionable recommendations. Batch audit multiple domains at once with optional subdomain scanning.