VOOZH about

URL: https://apify.com/nexgendata/google-cache-viewer

โ‡ฑ ๐Ÿ—‚๏ธ Google Cache Viewer โ€” Wayback & Archive Lookup ยท Apify


๐Ÿ‘ ๐Ÿ—‚๏ธ Google Cache Viewer โ€” Wayback + Archive Alternative avatar

๐Ÿ—‚๏ธ Google Cache Viewer โ€” Wayback + Archive Alternative

Pricing

Pay per event

Go to Apify Store

๐Ÿ—‚๏ธ Google Cache Viewer โ€” Wayback + Archive Alternative

Replaces Google's cached-page view (killed Feb 2024). Queries Wayback Machine + archive.today, returns latest snapshot URL, timestamp, and extracted text content.

Pricing

Pay per event

Rating

0.0

(0)

Developer

๐Ÿ‘ NexGenData

NexGenData

Maintained by Community

Actor stats

0

Bookmarked

10

Total users

2

Monthly active users

11 days ago

Last modified

Share

Google killed their cache view in February 2024. The cache: search operator, the "Cached" link in search results, and the webcache.googleusercontent.com subdomain โ€” all gone. For two decades, Google Cache was how the web retrieved temporarily-dead pages, saw what a site looked like a week ago, and debugged deployment issues.

This actor replaces it with a drop-in lookup against Internet Archive Wayback Machine + archive.today, returning the freshest available snapshot for any URL.

What it does

For every URL you provide, the actor:

  1. Queries Wayback Machine's public "closest snapshot" API
  2. Queries archive.today's /newest/ endpoint (follows redirect chain)
  3. Returns the freshest available snapshot with URL, ISO timestamp, and source
  4. Optionally fetches the snapshot HTML and extracts title + 8K char text content
  5. Emits a stable content hash for change detection

Example

import requests
r = requests.post(
"https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token="+ APIFY_TOKEN,
json={
"urls":[
"https://example.com/blog/post-now-deleted",
"https://techcrunch.com/2023/01/01/some-article"
],
"fetchContent":True
}
)
for item in r.json():
if item["found"]:
print(f"{item['url']}")
print(f" Archived: {item['latest_timestamp']} via {item['source']}")
print(f" Title: {item['content_title']}")
print(f" Preview: {item['content_text'][:200]}...")
else:
print(f"{item['url']} โ€” NOT ARCHIVED")

Sample output:

https://example.com/blog/post-now-deleted
Archived:2023-11-04T08:22:17Z via wayback
Title: How We Scaled to 10M Users
Preview: When we hit 10 million monthly users last fall, we learned...

cURL

curl-X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=$APIFY_TOKEN"\
-H"Content-Type: application/json"\
-d'{"urls":["https://example.com/"],"fetchContent":true}'

Why this replaces Google Cache

Google Cache (dead)This actor
StatusShut down Feb 2024Active
Accesscache:URL operator / webcache.googleusercontent.comHTTPS API
FreshnessLast Google crawlLast Wayback/archive.today snapshot (minutes to months)
Bulk modeManual, one URL at a time200 URLs/run
Text extractionโŒ (raw HTML)โœ… (8K char cleaned text)
Machine-readableโŒโœ… (JSON)
CostFree$0.003 per URL

Common use cases

  • Dead-link recovery โ€” find the last archived version of a page that 404'd
  • SEO audits โ€” see what a competitor's page used to say before they rewrote it
  • Journalism / OSINT โ€” pull the text of pages that were deleted after publication
  • Legal / compliance โ€” document what a contract/terms page said on a given date
  • Content monitoring โ€” track if an important page changed (via content_hash)
  • Affiliate link repair โ€” bulk lookup of product pages that were removed

Pricing

  • $0.005 per run (startup)
  • $0.003 per URL looked up (includes content extraction when requested)

100 URLs with content extraction = $0.305. Cheaper than Screaming Frog's archive plugin and no subscription.

FAQ

Q: Does archive.today always have the page? A: Not always. Wayback is broader; archive.today often has freshness Wayback doesn't. The actor queries both and returns the fresher of the two.

Q: What if neither has it? A: Returns found: false. Can't conjure pages that were never archived.

Q: Does this trigger a new archive capture? A: No โ€” read-only. To create a fresh capture, use Wayback's Save Page Now endpoint separately (your request, not ours).

Q: Rate limits? A: Wayback rate-limits shared usage at about 1 request/second per IP. This actor paces accordingly โ€” expect ~1 URL/second.

Q: How old can snapshots be? A: Wayback has archives dating to 1996. For any URL with a public history, you'll likely find something.

Related tools

Try it

๐Ÿ—‚๏ธ Google Cache Viewer on Apify

New to Apify? Get free platform credits.

๐Ÿ’ป Code Example โ€” Python

from apify_client import ApifyClient
client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("nexgendata/google-cache-viewer").call(run_input={
# Fill in the input shape from the actor's input_schema
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
print(item)

๐ŸŒ Code Example โ€” cURL

curl-X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=YOUR_TOKEN"\
-H"Content-Type: application/json"\
-d'{ /* input schema */ }'

โ“ FAQ

Q: How do I get started? Sign up at apify.com, grab your API token from Settings โ†’ Integrations, and run the actor via the Apify console, API, Python SDK, or any integration (Zapier, Make.com, n8n).

Q: What's the typical cost per run? See the pricing section below. Most runs finish under $0.10 for typical batches.

Q: Is this actor maintained? Yes. NexGenData maintains 165+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get responses within 24 hours.

Q: Can I use the output commercially? Yes โ€” you own the output data. Check the target site's Terms of Service for any usage restrictions on the scraped content itself.

Q: How do I handle rate limits? Apify manages concurrency and retries automatically. For very large batches (10K+ items), run multiple smaller jobs in parallel instead of one mega-job for better reliability.

๐Ÿ’ฐ Pricing

Pay-per-event pricing โ€” you only pay for what you actually extract.

  • Actor Start: $0.0001
  • result: $0.0050

๐Ÿ”— Related NexGenData Actors

๐Ÿš€ Apify Affiliate Program

New to Apify? Sign up with our referral link โ€” you get free platform credits on signup, and you help fund the maintenance of this actor fleet.

๐Ÿ“š More From NexGenData

Explore the full catalog, tutorials, Gumroad data packs, and newsletter at thenextgennexus.com โ€” the brand home for everything we ship.

  • ๐Ÿ“– Tutorials & how-to guides
  • ๐Ÿ—‚๏ธ Full actor catalog with usage examples
  • ๐Ÿ“ฆ Gumroad data packs (one-time purchases)
  • ๐Ÿ“ฌ Newsletter โ€” monthly drops of new actors and revenue experiments

Built and maintained by NexGenData โ€” 165+ actors covering scraping, enrichment, MCP servers, and automation. ๐Ÿ  Home: thenextgennexus.com


Why Google Cache Viewer Beats the Wayback Machine, Archive.today, Bing Cache & Cachedview.com

FeatureNexGenData Google Cache ViewerInternet Archive WaybackArchive.todayBing CacheCachedview.com
Cost$0.002 per URL, pay-per-eventFree (rate-limited, slow)Free (rate-limited)Removed by MicrosoftWeb-only (no API)
Bulk inputThousands per runOne per requestOne per requestRIPOne per page
Google cache fallbackYes โ€” webcache.googleusercontent.com while it lastedNoNoRIPWas the whole product
Wayback fallbackYes โ€” closest-to-target-date snapshotYes (it IS Wayback)PartialRIPNo
Archive.today fallbackYesNoYes (it IS them)RIPNo
Structured outputYes โ€” JSON with source + retrieved text + timestampHTML onlyHTML onlyRIPHTML only
Schedule + webhookNativeNoneNoneRIPNone
Monthly minimumNoneNoneDonationsRIPNone
AuthApify tokenNoneNoneRIPNone

Google deprecated webcache.googleusercontent.com in 2024. Bing Cache was removed years earlier. This actor stitches together every remaining cache source (Wayback Machine, Archive.today, archive.org search) into one bulk pipeline that returns the closest snapshot to a target date, plus extracted text, plus the source URL of the archived copy โ€” so SEO teams, journalists, and OSINT researchers stop manually pasting URLs into half a dozen archive sites.

Related NexGenData Infrastructure Actors

Use caseActor
Google CSE search API replacementgoogle-cse-replacement
goo.gl short URL resolver via Waybackgoo-gl-resolver
Alexa Rank replacement (site traffic)alexa-rank-replacement
Wappalyzer / BuiltWith tech-stack detectorwappalyzer-replacement
Lighthouse + Core Web Vitals auditorpage-speed-analyzer
WCAG 2.2 accessibility auditorwcag-accessibility-auditor
Bulk DNS A / MX / NS / TXT / CAA recordsdns-records-lookup
WHOIS / RDAP replacement (any TLD)whois-replacement
Bulk IP-to-country / city / ISP / ASNip-geolocation-replacement
Web scraping MCP for AI agentsweb-scraping-mcp-server

Browse the full NexGenData catalog of 260+ actors at https://apify.com/nexgendata?fpr=2ayu9b

You might also like

Wayback Machine Scraper

gio21/wayback-machine-scraper

List Internet Archive Wayback Machine snapshots for one or more URLs. Returns timestamp, snapshot URL, HTTP status, MIME type, digest. Useful for tracking website changes over time, OSINT research, content recovery, and brand monitoring.

Wayback Machine Snapshots Scraper โ€” Internet Archive History

seemuapps/wayback-machine-snapshots-scraper

List every Internet Archive snapshot of a URL, page, or whole domain. Timestamp, snapshot URL, status code, mime type, content length. No login.

Wayback Machine Checker

automation-lab/wayback-machine-checker

This actor checks if URLs are archived in the Internet Archive Wayback Machine. It retrieves snapshot counts, oldest and newest archive dates, and direct links to archived versions. Uses both the Availability API and CDX API for comprehensive results.

๐Ÿ‘ User avatar

Stas Persiianenko

41

Wayback Machine Search

crawlerbros/wayback-machine-search

Query Internet Archive's Wayback Machine for historical snapshots of any URL or domain. Filter by date, HTTP status, MIME type, and deduplicate. Optionally fetch the archived page text. Free public CDX API, no authentication.

Websites Archiver (Wayback Machine)

web.harvester/websites-archiver

Effortlessly archive any website with our Automated Website Archiving Tool. It leverages the power of the Wayback Machine at web.archive.org to ensure your sites are preserved for future reference.

88

5.0

Wayback Machine Historical Content Scraper

happyfhantum/wayback-machine-historical-content-scraper

Compare archived website snapshots through the Wayback Machine and extract page-history change signals.

89

4.0

Wayback Machine Archive Scraper

andok/wayback-machine-scraper

Fetch historical snapshots of any webpage from the Internet Archive. Perfect for digital forensics and tracking deleted content.

๐Ÿ•ฐ๏ธ Wayback Machine Bulk Checker

taroyamada/wayback-machine-checker

Check bulk lists of URLs against the Internet Archive database to instantly verify cache availability. Automate historical web page discovery for large sites.