URL: https://apify.com/gio21/wayback-machine-scraper

Wayback Machine Scraper · Apify


Pricing

from $1.00 / 1,000 snapshots scraped

Wayback Machine Scraper

List Internet Archive Wayback Machine snapshots for one or more URLs. Returns timestamp, snapshot URL, HTTP status, MIME type, digest. Useful for tracking website changes over time, OSINT research, content recovery, and brand monitoring.

Rating

0.0

(0)

Developer

๐Ÿ‘ Gio

Gio

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 24 days ago

List Internet Archive Wayback Machine snapshots for one or more URLs. Uses the CDX server API.

Returns timestamp, snapshot URL, HTTP status, MIME type, digest, byte length.

Useful for tracking website changes over time, OSINT research, content recovery, brand monitoring, link rot studies.
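
As a rough illustration of how a CDX response could map to the fields listed above, the sketch below converts the CDX server's `output=json` format (a header row followed by data rows) into records. The function name `cdx_rows_to_records` and the sample row are hypothetical; the actor's own implementation is not shown in this listing.

```python
import json

def cdx_rows_to_records(url, cdx_json_text):
    """Convert CDX `output=json` text (header row + data rows) into dicts.

    Hypothetical helper: the output field names follow this listing,
    not a confirmed view of the actor's internals.
    """
    rows = json.loads(cdx_json_text)
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    records = []
    for row in data:
        f = dict(zip(header, row))
        records.append({
            "url": url,
            "timestamp": f["timestamp"],
            "snapshotUrl": f"https://web.archive.org/web/{f['timestamp']}/{url}",
            "originalUrl": f["original"],
            "statusCode": f["statuscode"],
            "mimeType": f["mimetype"],
            "digest": f["digest"],
            # CDX returns length as a string; some rows may carry "-" instead.
            "length": int(f["length"]) if f["length"].isdigit() else None,
        })
    return records

# Illustrative sample shaped like a CDX JSON response (not live data).
sample = json.dumps([
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,apify)/", "20210105141317", "https://apify.com/", "text/html", "200",
     "QPBSADYPYQEHJ4NTAXNCLN7QHFFROZHU", "158034"],
])
print(cdx_rows_to_records("apify.com", sample)[0]["snapshotUrl"])
```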

Free vs. paid

  • Free plan: mock records for each URL.
  • Paid plan: real, live Wayback Machine data.

Input

Field                Type              Description
urls                 Array (required)  List of URLs to look up.
from                 String            Start date filter (YYYY, YYYYMMDD, or YYYYMMDDhhmmss).
to                   String            End date filter.
maxSnapshotsPerUrl   Integer           Default 50, max 1000.
debug                Boolean           Verbose logs.
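
A minimal sketch of how an input object could be checked against the fields above before starting a run. The helper `normalize_input` is hypothetical, and two of its behaviours are assumptions rather than documented facts: that `to` accepts the same formats as `from`, and that an out-of-range `maxSnapshotsPerUrl` is clamped rather than rejected.

```python
import re

# Accepts YYYY, YYYYMMDD, or YYYYMMDDhhmmss (digits only).
TIMESTAMP_RE = re.compile(r"\d{4}|\d{8}|\d{14}")

def normalize_input(raw):
    """Validate an actor input dict and fill in the documented defaults."""
    urls = raw.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("urls is required and must be a non-empty list")

    for key in ("from", "to"):
        value = raw.get(key)
        # Assumption: `to` uses the same formats as `from`.
        if value is not None and not TIMESTAMP_RE.fullmatch(value):
            raise ValueError(f"{key} must be YYYY, YYYYMMDD, or YYYYMMDDhhmmss")

    # Assumption: clamp to the documented range (default 50, max 1000).
    limit = max(1, min(int(raw.get("maxSnapshotsPerUrl", 50)), 1000))

    return {
        "urls": urls,
        "from": raw.get("from"),
        "to": raw.get("to"),
        "maxSnapshotsPerUrl": limit,
        "debug": bool(raw.get("debug", False)),
    }

print(normalize_input({"urls": ["apify.com"], "from": "2021", "to": "20211231"}))
```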

Output

{
  "url": "apify.com",
  "timestamp": "20210105141317",
  "snapshotUrl": "https://web.archive.org/web/20210105141317/apify.com",
  "originalUrl": "https://apify.com/",
  "statusCode": "200",
  "mimeType": "text/html",
  "digest": "QPBSADYPYQEHJ4NTAXNCLN7QHFFROZHU",
  "length": 158034
}
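
Note that `timestamp` and `statusCode` arrive as strings in the record above. The sketch below shows one way to turn them into typed values for downstream analysis. The helper `parse_record` is hypothetical, and it assumes a full 14-digit timestamp in UTC, which is how the Wayback Machine normally reports capture times.

```python
from datetime import datetime, timezone

def parse_record(record):
    """Derive typed values from one output record (hypothetical helper)."""
    # Assumes a 14-digit YYYYMMDDhhmmss timestamp, interpreted as UTC.
    captured_at = datetime.strptime(
        record["timestamp"], "%Y%m%d%H%M%S"
    ).replace(tzinfo=timezone.utc)
    status = record["statusCode"]
    return {
        "captured_at": captured_at,
        # Guard against non-numeric status values such as "-".
        "status": int(status) if status.isdigit() else None,
        "is_html": record["mimeType"] == "text/html",
    }

record = {
    "url": "apify.com",
    "timestamp": "20210105141317",
    "statusCode": "200",
    "mimeType": "text/html",
}
print(parse_record(record)["captured_at"].isoformat())
```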

Pricing

$0.001/snapshot. 1,000 snapshots = $1.
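
For budgeting, the per-snapshot price gives a simple upper bound on what a run can cost. The sketch below is only an estimate: `estimate_max_cost` is a hypothetical helper, and it assumes billing is purely per returned snapshot, with no other platform charges included.

```python
from decimal import Decimal

PRICE_PER_SNAPSHOT = Decimal("0.001")  # USD, from the pricing above

def estimate_max_cost(num_urls, max_snapshots_per_url=50):
    """Upper bound in USD: every URL returns the full snapshot cap."""
    return PRICE_PER_SNAPSHOT * num_urls * max_snapshots_per_url

# 20 URLs at the default cap of 50 snapshots each is at most 1,000 snapshots.
print(estimate_max_cost(20))
```

Actual cost may be lower, since a URL with fewer archived captures than the cap returns fewer billable snapshots.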

Limitations

  • The Wayback Machine's CDX server has soft rate limits (~1 req/sec). The actor waits 400 ms between URL queries.
  • For very popular URLs, the number of snapshots can be massive (millions). Use from/to to scope.
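
The two limitations above can be sketched together: scope each CDX query with `from`, `to`, and `limit`, and pause between URLs. This is an illustration, not the actor's code. `build_cdx_url` and `query_all` are hypothetical names, and the `fetch` callable is left to the caller so the example runs without network access.

```python
import time
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def build_cdx_url(url, from_ts=None, to_ts=None, limit=50):
    """Build a CDX query URL scoped by optional date bounds and a row limit."""
    params = {"url": url, "output": "json", "limit": limit}
    if from_ts:
        params["from"] = from_ts
    if to_ts:
        params["to"] = to_ts
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

def query_all(urls, fetch, delay_s=0.4, sleep=time.sleep, **scope):
    """Call fetch(cdx_url) for each URL, sleeping delay_s between queries."""
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_s)  # 400 ms pause, mirroring the delay described above
        results[url] = fetch(build_cdx_url(url, **scope))
    return results

# Dry run: "fetch" just echoes the URL it would have requested.
planned = query_all(["apify.com", "example.com"], fetch=lambda u: u,
                    from_ts="2021", to_ts="2022", limit=10)
for cdx_url in planned.values():
    print(cdx_url)
```

Passing `sleep` and `fetch` as parameters keeps the throttling logic easy to test; a real run would supply an HTTP client call as `fetch`.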

If this actor helped you, please leave a review on the Apify Store.
