URL: https://apify.com/automation-lab/bulk-url-status-checker

Bulk URL Status Checker | Redirect & Broken Link Audit · Apify


Pricing

Pay per event

Bulk URL Status Checker

Bulk check URLs for status codes, redirects, broken links, response times, canonical tags, robots meta, headers, and final destinations.

Developer

Stas Persiianenko

Maintained by Community

Check bulk URLs for HTTP status codes, redirect chains, broken links, response timing, canonical URLs, robots meta tags, content type, and final destination URLs.

Use this actor when you need a repeatable API-friendly URL audit for SEO, website migrations, QA, campaign launches, and content operations.

What does Bulk URL Status Checker do?

Bulk URL Status Checker takes URLs from pasted lists, text blocks, hosted URL lists, or XML sitemaps.

It checks each URL over HTTP and returns a structured dataset row for every URL.

The actor reports status code, status text, final URL, redirect count, redirect chain, broken-link flag, response time, content type, content length, canonical URL, robots meta, and error metadata.

It is designed for operational checks where the status itself is the data.

If a target page returns 403, 404, 500, timeout, or another failure, the actor records that response instead of treating the whole run as failed.
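The idea of treating a failed request as a result row rather than a run failure can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the actor's actual code; the `check_url` function and its `opener` parameter are hypothetical names introduced here.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone


def check_url(url, timeout=15, opener=urllib.request.urlopen):
    """Return one result row for `url` without raising on a failed request.

    Hypothetical helper: `opener` exists only so the network call can be
    swapped out; by default it is urllib's urlopen.
    """
    row = {"inputUrl": url, "statusCode": None,
           "errorType": None, "errorMessage": None}
    try:
        with opener(url, timeout=timeout) as response:
            row["statusCode"] = response.status
    except urllib.error.HTTPError as exc:
        # A 4xx/5xx answer is still an answer: keep the status as data.
        row["statusCode"] = exc.code
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failures, refused connections, timeouts, malformed URLs.
        row["errorType"] = type(exc).__name__
        row["errorMessage"] = str(exc)
    row["checkedAt"] = datetime.now(timezone.utc).isoformat()
    return row
```

Calling `check_url` in a loop always yields one row per URL, so a single 404 or timeout never stops the rest of the list from being checked.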

Who is it for?

  • 🔎 SEO agencies auditing migrations and technical SEO fixes.
  • 🧭 Website migration teams validating old-to-new URL maps.
  • 🧪 QA teams checking landing pages before releases.
  • 📰 Content operations teams finding removed or redirected articles.
  • 📈 Growth teams checking campaign URLs before launch.
  • 🧰 Developers building status-check APIs into internal dashboards.
  • 🧾 Analysts who need CSV, JSON, Excel, or API exports from URL checks.

Why use this URL status checker?

A simple browser test is not enough when you have hundreds or thousands of URLs.

This actor gives you a repeatable Apify run, dataset exports, API access, webhooks, and scheduling.

You can run it after deployments, before ad campaigns, during SEO migrations, and as part of weekly site health checks.

It is HTTP-only and lightweight, so it is cheaper than browser crawlers for status-code workflows.

Key features

  • ✅ Bulk HTTP status checks.
  • ✅ HEAD with GET fallback for speed and compatibility.
  • ✅ GET-only mode for servers that reject HEAD.
  • ✅ Redirect following with redirect-chain details.
  • ✅ Broken-link classification.
  • ✅ Response-time measurement.
  • ✅ Canonical URL extraction from HTML.
  • ✅ Robots meta extraction from HTML.
  • ✅ Content type and content length.
  • ✅ Optional raw response headers.
  • ✅ Sitemap URL ingestion.
  • ✅ Hosted plain-text or CSV-like URL list ingestion.
  • ✅ Configurable concurrency, timeout, and user-agent.

How much does it cost to check bulk URL status codes?

This actor uses pay-per-event pricing.

There is a small start fee and a per-URL checked fee.

The default starting prices in the actor package are:

  • Start event: $0.005 per run.
  • URL checked event: $0.000069405 at the BRONZE tier, with volume discounts for higher tiers.

Pricing was calculated from cloud runs using the standard 70% target NET margin formula.
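To estimate what a run might cost, you can combine the two prices listed above. The sketch below assumes the BRONZE tier price applies to every URL and ignores volume discounts and any platform charges not mentioned on this page; `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Prices quoted above (BRONZE tier, no volume discount applied).
START_FEE = Decimal("0.005")
PER_URL_BRONZE = Decimal("0.000069405")


def estimate_cost(url_count):
    """Estimated charge in USD for one run that checks `url_count` URLs."""
    return START_FEE + PER_URL_BRONZE * url_count


for count in (100, 1_000, 10_000):
    print(f"{count:>6} URLs: ${estimate_cost(count):.6f}")
```

For example, 1,000 URLs come to 0.005 + 1,000 × 0.000069405 = $0.074405 under these assumptions.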

Input sources

You can provide URLs in four ways.

  1. urls: direct list of URLs.
  2. urlsText: pasted text containing URLs separated by newlines, spaces, commas, tabs, or semicolons.
  3. sitemapUrl: XML sitemap URL; the actor extracts <loc> entries.
  4. listUrl: hosted text or CSV-like file containing URLs.

At least one source is required.

Duplicates are removed after normalization.
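A rough sketch of how a pasted urlsText block could be split and de-duplicated is shown below. The exact normalization rules the actor applies are not documented here, so this example assumes two: add https:// when no scheme is present, and drop the #fragment. The function names are hypothetical.

```python
import re
from urllib.parse import urldefrag


def parse_urls_text(text):
    """Split a pasted block on whitespace (newlines, spaces, tabs), commas, and semicolons."""
    return [token for token in re.split(r"[\s,;]+", text) if token]


def normalize(url):
    """Assumed normalization: default to https:// and strip the #fragment."""
    if "://" not in url:
        url = "https://" + url
    return urldefrag(url).url


def unique_urls(candidates, max_urls=None):
    """Normalize each candidate and keep the first occurrence, in order."""
    seen, unique = set(), []
    for raw in candidates:
        normalized = normalize(raw)
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique[:max_urls]
```

Because the comma is a separator, a URL that itself contains a comma would be split by this sketch; pass such URLs through the urls array instead.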

Input options

Field                Type      Description
urls                 array     URLs to check directly.
urlsText             string    Pasted URL block.
sitemapUrl           string    XML sitemap URL to parse.
listUrl              string    Hosted text or CSV-like URL list.
maxUrls              integer   Maximum unique URLs to check.
maxConcurrency       integer   Parallel URL checks.
timeoutSecs          integer   Request timeout per URL.
followRedirects      boolean   Follow redirects and report the final URL.
method               string    head-get-fallback or get.
includeHtmlSignals   boolean   Extract canonical and robots meta.
includeHeaders       boolean   Include raw response headers.
userAgent            string    Optional custom User-Agent.

Example input

{
  "urls": [
    "https://example.com/",
    "https://www.iana.org/domains/example",
    "https://httpstat.us/404"
  ],
  "maxUrls": 100,
  "maxConcurrency": 20,
  "timeoutSecs": 15,
  "followRedirects": true,
  "method": "head-get-fallback",
  "includeHtmlSignals": true,
  "includeHeaders": false
}

Sitemap audit example

{
  "sitemapUrl": "https://www.iana.org/sitemap.xml",
  "maxUrls": 500,
  "maxConcurrency": 10,
  "includeHtmlSignals": true
}

Use this mode to audit indexed URLs, migration sitemaps, or generated sitemap files.
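To show what extracting <loc> entries from a sitemap involves, here is a small standard-library sketch. It is an illustration only: `extract_loc_urls` is a hypothetical name, and the sketch does not follow sitemap index files or distinguish image or video <loc> entries, which the actor may handle differently.

```python
import xml.etree.ElementTree as ET

SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/about </loc></url>
</urlset>"""


def extract_loc_urls(sitemap_xml, max_urls=None):
    """Collect the text of every <loc> element, ignoring the XML namespace."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; compare only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text and element.text.strip():
            urls.append(element.text.strip())
    return urls[:max_urls]


print(extract_loc_urls(SAMPLE_SITEMAP))
```

The `max_urls` argument mirrors the role of maxUrls in the input: it caps how many extracted URLs are passed on for checking.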

Output data

Each dataset item represents one URL check.

Field            Description
inputUrl         Original URL supplied by the user.
normalizedUrl    URL after scheme normalization and hash removal.
statusCode       Final HTTP status code, or null on request error.
statusText       Human-readable status text when known.
finalUrl         Final URL after redirects.
redirectChain    Array of redirect hops with URL, status, and location.
redirectCount    Number of redirect hops.
isBroken         True for request errors or HTTP status 400+.
isRedirect       True when at least one redirect was followed.
responseTimeMs   Request duration in milliseconds.
contentType      Response Content-Type header.
contentLength    Response Content-Length header when available.
canonicalUrl     Canonical URL extracted from HTML, when requested.
robotsMeta       Robots meta content extracted from HTML, when requested.
errorType        Request error code or error name.
errorMessage     Request error message.
checkedAt        ISO timestamp for the check.
headers          Optional raw response headers.

Example output

{
  "inputUrl": "https://example.com/",
  "normalizedUrl": "https://example.com/",
  "statusCode": 200,
  "statusText": "OK",
  "finalUrl": "https://example.com/",
  "redirectChain": [],
  "redirectCount": 0,
  "isBroken": false,
  "isRedirect": false,
  "responseTimeMs": 184,
  "contentType": "text/html",
  "contentLength": 1256,
  "canonicalUrl": null,
  "robotsMeta": null,
  "errorType": null,
  "errorMessage": null,
  "checkedAt": "2026-06-22T00:00:00.000Z"
}
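Once rows like the one above are downloaded, a short script can turn them into a summary. The sketch below relies only on the documented inputUrl, statusCode, and isBroken fields; the `summarize` helper and the "error" bucket label for rows without a status code are choices made for this example.

```python
from collections import Counter


def summarize(items):
    """Count rows per status code and list the URLs flagged as broken."""
    by_status = Counter(
        "error" if item.get("statusCode") is None else item["statusCode"]
        for item in items
    )
    broken = [item["inputUrl"] for item in items if item.get("isBroken")]
    return {"byStatus": dict(by_status), "brokenUrls": broken}


rows = [
    {"inputUrl": "https://example.com/", "statusCode": 200, "isBroken": False},
    {"inputUrl": "https://httpstat.us/404", "statusCode": 404, "isBroken": True},
    {"inputUrl": "https://nope.invalid/", "statusCode": None, "isBroken": True},
]
print(summarize(rows))
```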

Redirect chain checks

When followRedirects is enabled, the actor follows redirects up to the HTTP client limit.

The final dataset row still represents the original URL.

The redirectChain field stores each hop with the source URL, status code, and Location header.

Use this for migration maps, HTTP-to-HTTPS checks, trailing-slash cleanup, and canonical destination validation.
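For a migration map, the check usually comes down to comparing finalUrl with the destination you expected. The following sketch assumes you keep your own old-to-new mapping (the `expected_targets` dictionary is hypothetical) and uses only the documented inputUrl, finalUrl, and redirectCount fields.

```python
def audit_migration(rows, expected_targets, max_hops=1):
    """Return (inputUrl, problem, detail) tuples for rows that miss the plan.

    `expected_targets` maps each old URL to the URL it should end up at.
    Rows for URLs that are not in the map are skipped.
    """
    problems = []
    for row in rows:
        expected = expected_targets.get(row["inputUrl"])
        if expected is None:
            continue
        if row["finalUrl"] != expected:
            problems.append((row["inputUrl"], "wrong-destination", row["finalUrl"]))
        elif row["redirectCount"] > max_hops:
            # Right destination, but reached through a longer chain than planned.
            problems.append((row["inputUrl"], "long-chain", row["redirectCount"]))
    return problems
```

An empty result means every mapped URL landed on its planned destination within `max_hops` redirects.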

Broken link checks

isBroken is true when the final status code is 400 or higher.

It is also true for invalid URLs, timeouts, DNS errors, TLS errors, and connection failures.

URLs that answer with a blocking status such as 401, 403, or 429 are recorded with that HTTP status in their dataset row.

That makes the actor useful for reporting what happened instead of hiding protected URLs as run failures.
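The rule described above can be written as a small predicate. This is a sketch of the documented behavior, not the actor's source; the `failure_kind` labels ("blocked", "client-error", and so on) are hypothetical groupings added here to show how you might bucket broken rows in a report.

```python
BLOCKING_STATUSES = {401, 403, 429}  # statuses named above as "blocked"


def is_broken(status_code, error_type=None):
    """True for request errors (no status code) or any HTTP status of 400+."""
    if error_type is not None or status_code is None:
        return True
    return status_code >= 400


def failure_kind(status_code, error_type=None):
    """Hypothetical report label for a row; None when the row is not broken."""
    if not is_broken(status_code, error_type):
        return None
    if status_code is None:
        return "request-error"
    if status_code in BLOCKING_STATUSES:
        return "blocked"
    return "client-error" if status_code < 500 else "server-error"
```

Note that a 403 row is both preserved as a status result and counted as broken, because its status is 400 or higher.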

Canonical and robots meta checks

When includeHtmlSignals is true, the actor parses HTML pages for:

  • canonical link: <link rel="canonical" href="...">
  • robots meta: <meta name="robots" content="...">

This is useful for SEO QA after site migrations and template changes.

The actor only attempts these checks for HTML responses.
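Extraction of these two tags can be illustrated with Python's built-in HTML parser. The sketch below is one possible approach under simple assumptions (first matching tag wins, content type must mention "html"); the class and function names are hypothetical and the actor's own parsing may differ.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Remember the first canonical link and the first robots meta tag."""

    def __init__(self):
        super().__init__()
        self.canonical_url = None
        self.robots_meta = None

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            if self.canonical_url is None:
                self.canonical_url = attr.get("href") or None
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            if self.robots_meta is None:
                self.robots_meta = attr.get("content") or None


def extract_html_signals(html, content_type):
    """Return canonicalUrl and robotsMeta, or nulls for non-HTML responses."""
    if "html" not in (content_type or "").lower():
        return {"canonicalUrl": None, "robotsMeta": None}
    parser = HeadSignals()
    parser.feed(html)
    return {"canonicalUrl": parser.canonical_url, "robotsMeta": parser.robots_meta}
```

A page without either tag yields null for both fields, which matches the null values you see in the dataset for such pages.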

Performance tips

Start with a maxConcurrency of 10 to 20 for general websites.

Use lower concurrency for small sites, fragile servers, or URLs behind rate limits.

Use head-get-fallback for most runs because HEAD is fast and GET fallback handles servers that reject HEAD.

Use get when you know target servers return inaccurate HEAD responses.

Keep includeHeaders disabled unless you need raw headers in exports.
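The difference between the two method values can be sketched as follows. This page does not say which responses trigger the GET fallback, so the example assumes 405 and 501 (the usual "method not supported" answers); `check_status` and the `send` callback are hypothetical names.

```python
import urllib.error
import urllib.request

# Assumption: statuses that suggest the server does not support HEAD.
HEAD_REJECTED = {405, 501}


def urllib_send(http_method, url, timeout=15):
    """Send one request with urllib and return the HTTP status code."""
    request = urllib.request.Request(url, method=http_method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def check_status(url, send=urllib_send, method="head-get-fallback"):
    """Return (http_method_used, status_code) for the chosen mode."""
    if method == "get":
        return "GET", send("GET", url)
    status = send("HEAD", url)
    if status in HEAD_REJECTED:
        # HEAD was refused, so repeat the check with GET.
        return "GET", send("GET", url)
    return "HEAD", status
```

In this sketch a server that rejects HEAD costs two requests per URL, which is why get mode can be the better choice when you already know the target behaves that way.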

Integrations

You can integrate this actor into many workflows:

  • Schedule a weekly sitemap status audit.
  • Trigger a run after a deployment.
  • Send broken-link results to Slack through Apify webhooks.
  • Export redirect chains to Google Sheets.
  • Pull dataset items into a BI tool.
  • Use API results in internal QA dashboards.
  • Compare old and new migration URL maps in a data warehouse.

API usage with Node.js

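import { ApifyClient } from 'apify-client';

// Authenticate with an API token read from the environment.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/bulk-url-status-checker').call({
    urls: ['https://example.com/', 'https://httpstat.us/404'],
    followRedirects: true,
});

// Read the result rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);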

API usage with Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/bulk-url-status-checker').call(run_input={
    'urls': ['https://example.com/', 'https://httpstat.us/404'],
    'followRedirects': True,
})

# Read the result rows from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~bulk-url-status-checker/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://example.com/","https://httpstat.us/404"],"followRedirects":true}'

MCP integration

Use Apify MCP to run this actor from Claude Desktop, Claude Code, or other MCP clients.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker

Add it in Claude Code:

$ claude mcp add apify-bulk-url-status-checker "https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker"

Claude Desktop JSON configuration:

{
  "mcpServers": {
    "apify-bulk-url-status-checker": {
      "url": "https://mcp.apify.com/?tools=automation-lab/bulk-url-status-checker"
    }
  }
}

Example prompts:

  • "Check these 50 URLs and summarize the broken links."
  • "Run a sitemap status audit and group results by status code."
  • "Find redirects in this migration URL list and export final URLs."

Data quality notes

HTTP status checks depend on target server behavior.

Some servers treat HEAD and GET differently.

Some servers block datacenter traffic, unknown user agents, or high concurrency.

For those cases, use GET mode, lower concurrency, and a custom user agent that identifies your crawler policy.

The actor reports the observed result rather than attempting to bypass access controls.

Troubleshooting

Why do I see 403 or 429?

The target server is refusing or rate limiting requests.

Lower concurrency, use a custom user agent, or check whether your organization allows automated checks against that domain.

Why is contentLength null?

Many servers use chunked transfer encoding or omit Content-Length.

The actor reports null when the header is missing.

Why is canonicalUrl null?

The page may not be HTML, canonical extraction may be disabled, or the page may not contain a canonical tag.

Legality and ethical use

Only check URLs you are allowed to audit.

Respect robots policies, rate limits, and site terms.

Do not use the actor to overload third-party servers.

The actor is intended for diagnostics, QA, SEO operations, and link-health monitoring.

Related scrapers and tools

Use the simpler HTTP Status Checker for small one-off status checks.

Use Bulk URL Status Checker when you need sitemap/list ingestion, canonical hints, robots meta, and richer redirect/broken-link audit fields.

FAQ

Can I check thousands of URLs?

Yes. Increase maxUrls and choose a concurrency that is safe for the target domains.

Does it use a browser?

No. It is an HTTP-only actor for status and header checks.

Does it scrape page content?

No. It only fetches enough page HTML to extract canonical and robots meta when that option is enabled.

Can I schedule it?

Yes. Use Apify schedules to run it daily, weekly, or after deployments.

Can I export to CSV?

Yes. Apify datasets can be exported as JSON, CSV, Excel, XML, RSS, or HTML.

Changelog

  • 0.1.0: Initial build with URL list, text, sitemap, and hosted list ingestion; status checks; redirect chain output; canonical and robots meta extraction.
