URL: https://apify.com/automation-lab/wayback-machine-checker

Wayback Machine Checker - Find URL Archive Snapshots · Apify


Pricing

Pay per event

Wayback Machine Checker

This actor checks if URLs are archived in the Internet Archive Wayback Machine. It retrieves snapshot counts, oldest and newest archive dates, and direct links to archived versions. Uses both the Availability API and CDX API for comprehensive results.

Developer

๐Ÿ‘ Stas Persiianenko

Stas Persiianenko

Maintained by Community

Actor stats: 0 bookmarked, 41 total users, 3 monthly active users, last modified a month ago

Wayback Machine Checker for URL Archive Monitoring

Check Internet Archive Wayback Machine availability and snapshot history for any list of URLs.

Best first run: check 10 important URLs

Paste your homepage plus a handful of high-value pages from a migration, SEO audit, legal hold, or content inventory. The output confirms archive coverage, first/last captures, snapshot counts, and direct Wayback URLs before you upload a full site list.

Typical cost: 10 URLs cost about $0.055 ($0.035 start + 10 × $0.002/URL). Use a small sample first, then scale to hundreds or thousands of URLs.

What does Wayback Machine Checker do?

This actor checks if URLs are archived in the Internet Archive Wayback Machine. It retrieves snapshot counts, oldest and newest archive dates, and direct links to archived versions. Uses both the Availability API and CDX API for comprehensive results, giving you a full picture of each URL's archive history. Process entire lists of domains or pages at once to quickly assess web presence history across your portfolio.
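
How the actor calls these two APIs internally is not documented on this page. As a rough illustration only, the sketch below builds request URLs for the public Availability endpoint (archive.org/wayback/available) and the CDX endpoint (web.archive.org/cdx/search/cdx), and reads the closest snapshot out of an Availability-style response. The helper names and the sample response are hypothetical, and no network request is made.

```python
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def availability_query(url: str) -> str:
    """Build an Availability API request URL for one page."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': url})}"


def cdx_query(url: str, limit: int = 1) -> str:
    """Build a CDX API request URL that returns up to `limit` captures as JSON."""
    params = {"url": url, "output": "json", "limit": limit}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response: dict) -> Optional[dict]:
    """Return the 'closest' snapshot from an Availability-style response, if any."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest
    return None


# Hypothetical response, shaped like the Availability API's JSON.
sample = {
    "url": "example.com",
    "archived_snapshots": {
        "closest": {
            "status": "200",
            "available": True,
            "url": "http://web.archive.org/web/20240101000000/https://example.com/",
            "timestamp": "20240101000000",
        }
    },
}

print(availability_query("https://example.com"))
print(cdx_query("https://example.com"))
print(closest_snapshot(sample)["timestamp"])
```

The Availability API reports only a single closest capture, which is presumably why the actor also consults the CDX index for counts and date ranges.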

Use cases

  • Domain investors -- check website history and archive age before purchasing a domain to assess its legitimacy
  • Content recovery specialists -- find archived versions of deleted or lost web pages and retrieve their content
  • Historians and researchers -- study how websites evolved over time with timestamped snapshots spanning decades
  • SEO professionals -- find broken pages with archived content for link reclamation and redirect opportunities
  • Journalists -- verify past claims by accessing archived versions of news articles, press releases, and public statements
  • Legal teams -- document web page history for intellectual property disputes or compliance investigations
  • Digital archivists -- audit which pages from a collection are preserved in the Wayback Machine and which are missing

Why use Wayback Machine Checker?

  • Batch processing -- check hundreds of URLs against the Wayback Machine in a single run instead of searching manually
  • Dual API approach -- uses both the Availability API and CDX API for more complete and reliable results than either alone
  • Structured output -- returns snapshot counts, dates, and direct archive URLs in clean JSON ready for analysis
  • Age calculation -- automatically computes how many years a URL has been archived, useful for domain valuation
  • Direct snapshot links -- provides clickable URLs to the oldest archived version so you can view historical content immediately
  • API and schedule ready -- automate archive checks via the Apify API or scheduled runs for ongoing monitoring
  • Pay-per-event pricing -- pay only for the runs and URLs you use: $0.035 per run start plus $0.002 per URL checked

Input parameters

Parameter | Type  | Required | Default | Description
urls      | array | Yes      | --      | List of URLs to check on the Wayback Machine

You can provide any publicly accessible URL, including deep subpages, not just root domains. Each URL is checked independently against the Internet Archive APIs.

Input example

{
    "urls": [
        "https://www.google.com",
        "https://www.wikipedia.org",
        "https://example.com"
    ]
}

Output example

Each result includes the URL, availability status, snapshot count, date range, a direct link to the oldest snapshot, and the computed archive age in years.

{
    "url": "https://www.google.com",
    "isAvailable": true,
    "snapshotCount": 10000,
    "oldestSnapshot": "1998-12-02",
    "newestSnapshot": "2026-02-28",
    "oldestSnapshotUrl": "https://web.archive.org/web/19981202230410/https://www.google.com",
    "firstArchiveYear": 1998,
    "archiveAgeYears": 27.2,
    "error": null,
    "checkedAt": "2026-03-01T12:00:00.000Z"
}
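
The oldestSnapshotUrl value embeds a 14-digit Wayback timestamp (YYYYMMDDhhmmss) followed by the original URL. A minimal sketch of splitting the two apart, assuming the plain /web/<timestamp>/<url> form shown in the example above (the helper name is hypothetical, and URLs with extra modifiers after the timestamp are not handled):

```python
import re
from datetime import datetime

# Matches https://web.archive.org/web/<14-digit timestamp>/<original URL>
SNAPSHOT_URL = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_snapshot_url(snapshot_url: str) -> tuple[datetime, str]:
    """Return (capture time, original URL) for a Wayback snapshot link."""
    match = SNAPSHOT_URL.match(snapshot_url)
    if match is None:
        raise ValueError(f"Not a recognised snapshot URL: {snapshot_url}")
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)


captured_at, original_url = parse_snapshot_url(
    "https://web.archive.org/web/19981202230410/https://www.google.com"
)
print(captured_at.date().isoformat(), original_url)
```

The date portion of the parsed timestamp should agree with the oldestSnapshot field of the same result.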

How to check URLs on the Wayback Machine

  1. Open Wayback Machine Checker in Apify Console.
  2. Enter a list of URLs you want to check against the Internet Archive.
  3. Click Start to run the checker.
  4. View results in the Dataset tab -- each URL shows snapshot count, oldest/newest archive dates, and direct links to archived versions.
  5. Download results as JSON, CSV, or Excel.

How much does it cost to check Wayback Machine availability?

Wayback Machine Checker uses Apify's pay-per-event pricing model. You are only charged for what you actually use -- no monthly fees, no subscriptions.

Event       | Price  | Description
Start       | $0.035 | One-time per run
URL checked | $0.002 | Per URL checked

Cost examples:

  • Checking 10 URLs: $0.035 + (10 x $0.002) = $0.055
  • Checking 50 URLs: $0.035 + (50 x $0.002) = $0.135
  • Checking 100 URLs: $0.035 + (100 x $0.002) = $0.235
  • Checking 1,000 URLs: $0.035 + (1,000 x $0.002) = $2.035
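
The same arithmetic can be sketched as a small helper for budgeting larger lists. It assumes the two event prices in the table above stay as listed; the function name is illustrative, and Decimal is used to avoid floating-point rounding in the totals.

```python
from decimal import Decimal

START_FEE = Decimal("0.035")    # charged once per run
PER_URL_FEE = Decimal("0.002")  # charged for each URL checked


def estimate_cost(url_count: int) -> Decimal:
    """Estimate the cost in USD of one run that checks `url_count` URLs."""
    if url_count < 0:
        raise ValueError("url_count must not be negative")
    return START_FEE + PER_URL_FEE * url_count


for count in (10, 50, 100, 1000):
    print(f"{count} URLs: ${estimate_cost(count)}")
```

Because the start fee is charged per run, one run of 1,000 URLs costs less than ten runs of 100 URLs each.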

Using the Apify API

You can call Wayback Machine Checker programmatically from any language using the Apify API. The actor slug is automation-lab/wayback-machine-checker. Below are ready-to-use examples for Node.js and Python, followed by a raw HTTP request with cURL. Replace YOUR_TOKEN with your Apify API token.

Node.js

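// Run as an ES module (for example a .mjs file) so that top-level await works.
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/wayback-machine-checker').call({
    urls: ['https://www.google.com', 'https://www.wikipedia.org'],
});

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);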

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/wayback-machine-checker').call(run_input={
    'urls': ['https://www.google.com', 'https://www.wikipedia.org'],
})

# Read the results from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~wayback-machine-checker/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://www.google.com", "https://www.wikipedia.org"]
  }'

Use with Claude AI (MCP)

This actor is available as a tool in Claude AI through the Model Context Protocol (MCP). Add it to Claude Desktop, Cursor, Windsurf, or any MCP-compatible client.

Setup for Claude Code

$ claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/wayback-machine-checker"

Setup for Claude Desktop, Cursor, or VS Code

Add this to your MCP config file:

{
    "mcpServers": {
        "apify": {
            "url": "https://mcp.apify.com?tools=automation-lab/wayback-machine-checker"
        }
    }
}

Example prompts

  • "Check if example.com has Wayback Machine snapshots"
  • "Find archived versions of this deleted page: https://example.com/old-page"
  • "How old is this domain based on its Wayback Machine archive history?"

Learn more in the Apify MCP documentation.

Integrations

Wayback Machine Checker integrates with the major automation and data platforms through the Apify ecosystem:

  • Make (formerly Integromat) -- trigger archive checks automatically when new domains appear in your pipeline or CRM.
  • Zapier -- create Zaps that check Wayback Machine availability whenever a new URL is added to a list.
  • Google Sheets -- send results to a spreadsheet for tracking domain archive history over time.
  • Slack -- alert your team when a domain has no archive history or when a previously archived page disappears.
  • Webhooks -- post-process results in your own backend for domain valuation or research workflows.
  • n8n -- orchestrate runs from n8n workflows or any platform that supports HTTP requests and the Apify REST API.

Tips and best practices

  • Use full URLs -- include the protocol (https://) for the most accurate results from the Wayback Machine APIs.
  • Check before buying domains -- a long archive history with legitimate content is a positive signal for domain valuation and SEO potential.
  • Combine with other actors -- pair with Website Uptime Checker to see if a site is live now and how long it has been archived.
  • Schedule periodic checks -- set up a weekly schedule to monitor whether important pages continue to be archived over time.
  • Use snapshot URLs directly -- the oldestSnapshotUrl field gives you a direct link you can open in a browser to view the archived page.
  • Batch domains for due diligence -- when evaluating multiple domains for acquisition, run them all in a single batch to compare archive histories side by side.
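
The "use full URLs" and batching tips can be combined in a small pre-processing step before a run. The sketch below is one possible approach, not part of the actor: it trims whitespace, adds a protocol where none is given, drops duplicates, and returns a dictionary in the shape of the input example. The function name and the https default are assumptions.

```python
def build_input(raw_urls, default_scheme="https"):
    """Clean a list of URLs and wrap it in the actor's input shape."""
    seen = set()
    urls = []
    for raw in raw_urls:
        candidate = raw.strip()
        if not candidate:
            continue  # skip blank lines from pasted lists
        if "://" not in candidate:
            # Assumption: bare hosts and paths should be checked over https.
            candidate = f"{default_scheme}://{candidate}"
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return {"urls": urls}


actor_input = build_input(
    [" example.com ", "https://example.com", "", "https://www.wikipedia.org"]
)
print(actor_input)
```

Removing duplicates before the run also avoids paying the per-URL fee twice for the same address.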

Legality

This tool analyzes publicly accessible web content. Automated analysis of public web resources is standard practice in SEO and web development. Always respect robots.txt directives and rate limits when analyzing third-party websites. For personal data processing, ensure compliance with applicable privacy regulations.

FAQ

What if a URL has never been archived? The actor returns isAvailable: false with snapshotCount: 0 and null values for the snapshot date fields. The error field remains null because the check itself succeeded.
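
Given those field semantics, results can be sorted into archived, never archived, and failed checks. This is a minimal sketch over a list of result items; the sample items and the error message are made up for illustration, and a failed check is assumed to carry a non-empty error string.

```python
def partition_results(items):
    """Group result URLs by outcome: archived, missing, or failed."""
    groups = {"archived": [], "missing": [], "failed": []}
    for item in items:
        if item.get("error"):
            groups["failed"].append(item["url"])    # the check itself failed
        elif item.get("isAvailable"):
            groups["archived"].append(item["url"])  # at least one snapshot
        else:
            groups["missing"].append(item["url"])   # checked, never archived
    return groups


# Hypothetical items using the field names from the output example.
sample_items = [
    {"url": "https://www.google.com", "isAvailable": True, "snapshotCount": 10000, "error": None},
    {"url": "https://example.com/new-page", "isAvailable": False, "snapshotCount": 0, "error": None},
    {"url": "https://example.com/slow", "isAvailable": False, "snapshotCount": 0, "error": "timeout"},
]

print(partition_results(sample_items))
```

Keeping failed checks separate from missing ones matters because only the former are worth retrying.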

Does the actor create new Wayback Machine snapshots? No. It only queries existing snapshots through the Internet Archive APIs. It does not trigger new crawls or submit URLs for archiving. To request a new snapshot, use the Wayback Machine's Save Page Now feature directly.

Are there rate limits on the Wayback Machine API? The Internet Archive may throttle requests if too many are sent in a short period. The actor handles this gracefully with automatic retries and pacing to stay within acceptable limits.
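
The actor's actual retry and pacing logic is not described on this page. For readers building their own Internet Archive client, a generic retry with exponential backoff might look like the sketch below; the function name, attempt count, and delays are all assumptions, and the sleep function is injectable so the demo does not actually wait.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, doubling the wait after each failed attempt."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:  # real code would catch only timeout/throttling errors
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Demo: an operation that fails twice, then succeeds.
calls = {"count": 0}
recorded_delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated throttling")
    return "ok"


result = retry_with_backoff(flaky, sleep=recorded_delays.append)
print(result, recorded_delays)
```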

Can I export results to CSV? Yes. Apify datasets support export in JSON, CSV, Excel, XML, and other formats. After the run completes, download results from the Apify Console or use the API to export in your preferred format.
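
If the items have already been fetched through the API client, they can also be written to CSV locally. The sketch below uses Python's standard csv module and the field names from the output example; the helper name and the sample item are illustrative.

```python
import csv
import io

FIELDS = [
    "url", "isAvailable", "snapshotCount", "oldestSnapshot", "newestSnapshot",
    "oldestSnapshotUrl", "firstArchiveYear", "archiveAgeYears", "error", "checkedAt",
]


def items_to_csv(items):
    """Serialise result items to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(items)  # missing or None values become empty cells
    return buffer.getvalue()


csv_text = items_to_csv([
    {
        "url": "https://www.google.com",
        "isAvailable": True,
        "snapshotCount": 10000,
        "oldestSnapshot": "1998-12-02",
        "error": None,
    }
])
print(csv_text)
```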

Can I check subpages or just root domains? You can check any full URL, including deep subpages. The Wayback Machine archives individual pages, so https://example.com/blog/post-1 and https://example.com are tracked separately with their own snapshot histories.

What does the archiveAgeYears field represent? It is the number of years between the oldest snapshot date and the current date, calculated as a decimal. For example, 27.2 means the URL has been archived for approximately 27 years and 2 months.
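
The exact formula the actor uses is not published here. One plausible reconstruction, assuming the age is the day difference divided by 365.25 and rounded to one decimal, reproduces the value in the output example:

```python
from datetime import date


def archive_age_years(oldest_snapshot: str, today: date) -> float:
    """Years between an ISO date string (YYYY-MM-DD) and `today`, to one decimal."""
    oldest = date.fromisoformat(oldest_snapshot)
    # Assumption: 365.25 days per year to average out leap years.
    return round((today - oldest).days / 365.25, 1)


# Dates taken from the output example (oldestSnapshot and checkedAt).
print(archive_age_years("1998-12-02", date(2026, 3, 1)))
```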

The actor returns an error or zero snapshots for a URL I know was archived. What happened? The Internet Archive APIs may occasionally be slow or rate-limited. The actor retries automatically, but during heavy load the API may still time out. Try running again after a few minutes. Also ensure you are using the exact URL format -- https://www.example.com and https://example.com are tracked separately in the Wayback Machine.
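
Because the www and non-www forms are reported separately, one workaround is to submit both variants of a URL and compare the results. A minimal sketch, assuming a simple host without credentials or a port; the helper name is hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit


def host_variants(url: str) -> list[str]:
    """Return the URL as given plus its www / non-www counterpart."""
    parts = urlsplit(url)
    host = parts.netloc
    alternate = host[4:] if host.startswith("www.") else f"www.{host}"
    return [url, urlunsplit(parts._replace(netloc=alternate))]


print(host_variants("https://www.example.com"))
print(host_variants("https://example.com/blog/post-1"))
```

Each variant counts as a separate URL check, so this doubles the per-URL cost for the pages it is applied to.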

The snapshot count seems very low compared to what I see on web.archive.org. Why? The CDX API used by the actor may return summarized results for very popular domains with millions of snapshots. The count is still useful for comparing relative archive depth across URLs, but for exact totals on heavily archived sites, the Wayback Machine web interface may show more detail.
