URL: https://apify.com/sidjain/apify-webscrap

AI Web Scraper - Webscraper with AI based Summery or answer · Apify

Under maintenance

Pricing

Pay per usage

Web Page Scraper + AI Summary/Answer: Scrapes any URL, extracts content (text, links, images, tables, lists, raw HTML, tech stack), automatically falls back to a headless browser for JS-rendered sites, and optionally generates an AI summary or answer from your prompt. Try it with the frontend at https://aiscraperweb.netlify.app/

Rating

0.0

(0)

Developer

Siddharth Jain

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 46
Monthly active users: 3
Last modified: 6 months ago

Web Scraper AI (Apify Actor)

Scrape a web page (HTTP first, browser fallback for JS-heavy pages) and optionally generate an AI summary via Pollinations.

What it does

  • Validates the input URL (http / https only)
  • Tries a fast HTTP scrape first (Axios + Cheerio)
  • If the page looks blocked / empty / JS-rendered, falls back to a headless browser scrape (Puppeteer + @sparticuz/chromium)
  • Optionally calls Pollinations (https://text.pollinations.ai/) to produce a summary using your prompt
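
The HTTP-first, browser-fallback flow above can be sketched roughly as follows. This is not the Actor's actual code: `needsBrowserFallback`, `scrapeWithFallback`, the `MIN_VISIBLE_CHARS` threshold, the status codes treated as "blocked", and the `'http'` / `'browser'` values for `methodUsed` are all assumptions made for illustration. The two scrapers are passed in as functions so the sketch stays independent of Axios and Puppeteer.

```javascript
// Hypothetical threshold: pages with less visible text than this are treated as empty / JS-rendered.
const MIN_VISIBLE_CHARS = 200;

// Hypothetical heuristic: decide whether an HTTP result looks blocked, empty, or JS-rendered.
function needsBrowserFallback({ status, html }) {
  // Status codes commonly returned by bot protection or rate limiting (an assumption).
  if (status === 403 || status === 429 || status === 503) return true;
  if (!html || html.trim() === '') return true;

  // Roughly estimate visible text by dropping scripts, styles, and tags.
  const visibleText = html
    .replace(/<script[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();

  return visibleText.length < MIN_VISIBLE_CHARS;
}

// httpScrape(url) is expected to resolve to { status, html };
// browserScrape(url) is expected to resolve to an HTML string.
async function scrapeWithFallback(url, httpScrape, browserScrape) {
  try {
    const response = await httpScrape(url);
    if (!needsBrowserFallback(response)) {
      return { methodUsed: 'http', html: response.html };
    }
  } catch {
    // A failed HTTP request also falls through to the browser scrape.
  }
  const html = await browserScrape(url);
  return { methodUsed: 'browser', html };
}
```

Keeping the decision in a pure function makes the heuristic easy to tune and test without launching a browser.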

Input

This Actor expects a JSON input with:

  • url (required): page URL to scrape
  • prompt (optional): if provided, the Actor will request an AI summary

Example:

{
  "url": "https://example.com",
  "prompt": "Summarize the page in 5 bullets"
}
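
A minimal sketch of how such an input could be validated before scraping, following the rules stated above (url required, http / https only, prompt optional). The `parseInput` helper and the `wantsSummary` flag are hypothetical names, not part of the Actor.

```javascript
// Hypothetical helper: validates and normalizes the Actor input.
function parseInput(raw) {
  if (!raw || typeof raw.url !== 'string' || raw.url.trim() === '') {
    throw new Error('Input field "url" is required');
  }

  let parsed;
  try {
    parsed = new URL(raw.url.trim());
  } catch {
    throw new Error(`Invalid URL: ${raw.url}`);
  }

  // Only http and https pages are accepted.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }

  // An empty or missing prompt means no AI summary is requested.
  const prompt = typeof raw.prompt === 'string' ? raw.prompt.trim() : '';
  return { url: parsed.href, prompt, wantsSummary: prompt.length > 0 };
}
```

With the example input above, `parseInput` would return `wantsSummary: true`; an input with only `url` would skip the summary step.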

Output

The Actor writes results to:

  • Default dataset (one item per run)
  • Key-value store as OUTPUT

Fields include:

  • url, methodUsed, scrapedAt
  • title, description, paragraphs, images, links
  • tables, lists, uniqueComponents, rawHTML, techStack
  • summary / AI answer (only if prompt is provided)
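
As a rough illustration of consuming one output item, the sketch below checks which of the fields listed above are present and counts a few of them. The field types (arrays for paragraphs and links) and the `summary` key name are assumptions; the README does not specify them. `summarizeItem` is a hypothetical helper.

```javascript
// Field names taken from the list above.
const DOCUMENTED_FIELDS = [
  'url', 'methodUsed', 'scrapedAt',
  'title', 'description', 'paragraphs', 'images', 'links',
  'tables', 'lists', 'uniqueComponents', 'rawHTML', 'techStack',
];

// Hypothetical helper: gives a quick overview of one output item.
function summarizeItem(item) {
  // Assumes list-like fields are arrays; anything else counts as 0.
  const count = (value) => (Array.isArray(value) ? value.length : 0);
  return {
    url: item.url,
    methodUsed: item.methodUsed,
    paragraphCount: count(item.paragraphs),
    linkCount: count(item.links),
    // Assumes the AI answer is stored as a string under `summary`.
    hasSummary: typeof item.summary === 'string' && item.summary.length > 0,
    missing: DOCUMENTED_FIELDS.filter((field) => !(field in item)),
  };
}
```

For a local run, the item could be loaded with `JSON.parse(fs.readFileSync('storage/datasets/default/000000001.json', 'utf8'))` and passed to `summarizeItem`.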

Run locally (Windows)

Install deps:

$ npm install

Quick smoke test:

$ npm run test:local

Run with your own URL:

$ npm run set-input -- --url https://example.com --prompt "Summarize this page"
$ npm start

Where to find output:

  • storage/datasets/default/000000001.json
  • storage/key_value_stores/default/OUTPUT.json

Deploy / host on Apify

Option A: Apify Console (UI, easiest)

  1. Zip the project (or connect Git repo)
  2. In Apify Console → Actors → Create new → Source code
  3. Upload the code
  4. Build the Actor image
  5. Run it with an input JSON (see Input section)

Option B: Apify CLI

If you use the Apify CLI:

  1. Install/login:
$ npm i -g apify-cli
$ apify login
  2. From this project folder:
$ apify push
  3. Then run it from Apify Console or via CLI.

Will I get the “form layout” like other Actors?

Yes.

Apify shows an input form UI automatically when your Actor provides an input schema. This project includes:

  • INPUT_SCHEMA.json (defines the fields)
  • actor.json references it via the input property

That's what generates the “every actor has to fill details” form.

If you want more fields (proxy, cookies, max pages, etc.), you extend INPUT_SCHEMA.json and update the code to read them.
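
As a sketch of what extending the schema might look like, the object below mirrors the structure of an Apify input schema (the `properties` and `required` keys would live in INPUT_SCHEMA.json), written as a JavaScript object so the code side can be shown alongside it. The schema title, the descriptions, the `maxPages` field, and the `readOptions` helper are hypothetical; the project's real schema may differ.

```javascript
// Sketch of INPUT_SCHEMA.json contents as a JS object.
// `maxPages` is a hypothetical extra field added for illustration.
const inputSchema = {
  title: 'Web Scraper AI input',
  type: 'object',
  schemaVersion: 1,
  properties: {
    url: { title: 'URL', type: 'string', description: 'Page URL to scrape', editor: 'textfield' },
    prompt: { title: 'Prompt', type: 'string', description: 'Optional prompt for the AI summary', editor: 'textarea' },
    maxPages: { title: 'Max pages', type: 'integer', description: 'Hypothetical page limit', default: 1, minimum: 1 },
  },
  required: ['url'],
};

// Hypothetical helper: reads every schema field from the input, applying schema defaults.
function readOptions(input) {
  const missing = inputSchema.required.filter((key) => input[key] === undefined);
  if (missing.length > 0) {
    throw new Error(`Missing required input: ${missing.join(', ')}`);
  }
  const options = {};
  for (const [key, spec] of Object.entries(inputSchema.properties)) {
    options[key] = input[key] !== undefined ? input[key] : spec.default;
  }
  return options;
}
```

Driving the code from the same field list as the schema is one way to keep the form and the Actor logic from drifting apart when fields are added.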

Notes / limitations

  • Some sites block scraping (bot protection, captchas, login walls). In those cases, the Actor may return BLOCKED / LOGIN_REQUIRED.
  • AI summary depends on Pollinations availability/rate limits.
