Websites Archiver (Wayback Machine)

Pricing

$9.00/month + usage

Effortlessly archive any website with our Automated Website Archiving Tool. It leverages the power of the Wayback Machine at web.archive.org to ensure your sites are preserved for future reference.

Rating

5.0

(1)

Developer

Web Harvester

Maintained by Community

Last modified: 6 months ago

Usage

The actor accepts input in the following format:

{
  "startUrls": [
    {
      "url": "https://crawlee.dev"
    }
  ],
  "fastArchiveMode": true,
  "archiveErrorPages": true,
  "storeArchivedResources": false
}

Input Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `startUrls` | array | required | List of URLs to archive. |
| `fastArchiveMode` | boolean | `true` | When enabled, sends the archive request without waiting for it to complete. Faster, but the output is less detailed. |
| `archiveErrorPages` | boolean | `true` | Whether to archive pages that return HTTP 4xx and 5xx status codes. |
| `storeArchivedResources` | boolean | `false` | Whether to include the list of archived resources in the output (only available in full mode). |
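
The options and defaults listed above can be captured in a small helper. This is a minimal Python sketch, not part of the actor: `build_input` is a hypothetical name, and rejecting an empty URL list or `storeArchivedResources` in fast mode is this sketch's own choice, based on the notes that `startUrls` is required and that the resource list is only available in full mode.

```python
from typing import Dict, Iterable, List, Union

ActorInput = Dict[str, Union[bool, List[Dict[str, str]]]]


def build_input(
    urls: Iterable[str],
    fast_archive_mode: bool = True,
    archive_error_pages: bool = True,
    store_archived_resources: bool = False,
) -> ActorInput:
    """Build an actor input object using the documented defaults."""
    start_urls = [{"url": u} for u in urls]
    if not start_urls:
        # Assumption: treat an empty list the same as a missing required field.
        raise ValueError("startUrls is required and must not be empty")
    if store_archived_resources and fast_archive_mode:
        # Assumption: fail early, since the resource list needs full mode.
        raise ValueError("storeArchivedResources needs fastArchiveMode=False")
    return {
        "startUrls": start_urls,
        "fastArchiveMode": fast_archive_mode,
        "archiveErrorPages": archive_error_pages,
        "storeArchivedResources": store_archived_resources,
    }
```

Calling `build_input(["https://crawlee.dev"])` yields the same object as the example input in the Usage section.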

Output

Full Archive Mode (`fastArchiveMode: false`)

{
  "url": "https://crawlee.dev",
  "archivedUrl": "https://web.archive.org/web/20240610223756/https://crawlee.dev/",
  "archived": true,
  "archivedAt": "2024-06-10T22:38:15.643Z",
  "archivedResourcesCount": 69,
  "archivedResources": [
    "https://crawlee.dev/",
    "https://crawlee.dev/js/custom.js",
    "https://crawlee.dev/assets/css/styles.5a93fba9.css"
  ]
}

Fast Archive Mode (`fastArchiveMode: true`)

{
  "url": "https://crawlee.dev",
  "archivedUrl": "https://web.archive.org/web/20240610223756/https://crawlee.dev/",
  "archived": true,
  "archivedAt": "2024-06-10T22:38:15.643Z"
}

Failed Archive

{
  "url": "https://example.com/blocked",
  "archivedUrl": null,
  "note": "This URL has been excluded from the Wayback Machine",
  "archived": false
}
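
The output shapes above can be told apart by the `archived` flag. The following Python sketch splits a list of result items into successes and failures. `summarize` and `SAMPLE_ITEMS` are hypothetical names for illustration; the field names and sample values are taken from the examples above.

```python
from typing import Any, Dict, List, Tuple

# Sample items copied from the fast-mode and failed-archive examples.
SAMPLE_ITEMS: List[Dict[str, Any]] = [
    {
        "url": "https://crawlee.dev",
        "archivedUrl": "https://web.archive.org/web/20240610223756/https://crawlee.dev/",
        "archived": True,
        "archivedAt": "2024-06-10T22:38:15.643Z",
    },
    {
        "url": "https://example.com/blocked",
        "archivedUrl": None,
        "note": "This URL has been excluded from the Wayback Machine",
        "archived": False,
    },
]


def summarize(items: List[Dict[str, Any]]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (archived, failed): url -> snapshot URL, and url -> failure note."""
    archived: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for item in items:
        if item.get("archived"):
            archived[item["url"]] = item["archivedUrl"]
        else:
            # Assumption: a failed item may lack a note, so default to "".
            failed[item["url"]] = item.get("note", "")
    return archived, failed
```

Because the sketch reads only `url`, `archived`, `archivedUrl`, and `note`, it handles fast-mode and full-mode items alike.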

Running the Actor

To run the actor, you need an Apify account. Once logged in, you can run it from the Apify Console or programmatically through the Apify API.

For more information on how to use Apify Actors, please refer to the Apify documentation.
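
As an illustration of the programmatic route, the sketch below builds a request for the Apify API's synchronous "run and get dataset items" endpoint using only the Python standard library. Several things here are assumptions rather than facts from this page: the actor ID `web.harvester~websites-archiver` is inferred from the store URL, `build_run_request` and `run_actor` are hypothetical helper names, and the token is a placeholder. Long archive runs may exceed the synchronous endpoint's time limit, in which case an asynchronous run is the safer choice.

```python
import json
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"
# Assumption: actor ID inferred from the store URL, with "/" written as "~".
ACTOR_ID = "web.harvester~websites-archiver"


def build_run_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the actor synchronously."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token: str, actor_input: Dict[str, Any], timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and return the dataset items (performs a network call)."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `run_actor("<YOUR_APIFY_TOKEN>", {"startUrls": [{"url": "https://crawlee.dev"}]})` would be expected to return a list of items shaped like the Output examples.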

URL: https://apify.com/web.harvester/websites-archiver