URL: https://apify.com/scraperforge/google-news-scraper

Google News Scraper · Apify


Pricing

$19.99/month + usage

Google News Scraper

Google News Scraper collects real-time headlines, publishers, snippets, dates & links from Google News. Filter by keywords, topics, country & language. Export JSON/CSV, deduplicate & schedule crawls. Perfect for media monitoring, trend tracking & research.

Rating

0.0

(0)

Developer

ScraperForge

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 months ago

Google News Scraper

Google News Scraper is a fast, reliable Google News scraping tool that collects headlines, publishers, snippets, dates, links, and images from Google News RSS. It suits marketers, developers, data analysts, and researchers who need to scrape Google News at scale. It targets the Google News RSS feed, handles regions and languages, and delivers clean, structured results for media monitoring, trend tracking, and research. With async fetching, proxy fallback, and de-duplication by GUID, it extracts Google News data consistently without manual effort.

What data / output can you get?

Below are the exact fields this Google News crawler stores for each article it collects and pushes to the Apify dataset.

| Field | Description | Example value |
| --- | --- | --- |
| position | Result index in the current run (1-based) | 1 |
| title | Article headline | Tesla announces new factory plans in Mexico |
| link | Direct article URL (resolved from RSS redirect where possible) | https://example.com/tesla-factory-plans |
| domain | Domain derived from source name or article URL | example.com |
| source | Publisher/source name parsed from RSS entry | Bloomberg |
| date | Human-friendly relative time computed from pubDate | 2 hours ago |
| date_utc | ISO 8601 UTC timestamp computed from pubDate | 2026-03-15T10:30:00+00:00 |
| snippet | Cleaned snippet extracted from the RSS description | Tesla is planning a new manufacturing facility in Mexico... |
| thumbnail | Base64 data URL for a fetched article image (Open Graph/Twitter Card/inline) | data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ... |
| block_position | Same as position; maintained for compatibility | 1 |

Notes:

  • The actor de-duplicates by GUID during parsing to prevent duplicate items.
  • Thumbnails are retrieved from the article page when possible and encoded as base64; when the image cannot be determined or fetched, this field may be empty.
  • Snippets are derived by cleaning HTML from the RSS description.
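
Because thumbnail is a base64 data URL rather than a link, consumers usually need to decode it before saving or displaying the image. The following is a minimal sketch of that step, not part of the actor; the helper name decode_thumbnail and the sample bytes are illustrative assumptions.

```python
import base64


def decode_thumbnail(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes).

    Returns None when the field is empty or is not a base64 data URL.
    """
    if not data_url or not data_url.startswith("data:"):
        return None
    header, _, payload = data_url.partition(",")
    if not header.endswith(";base64"):
        return None
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Hand-made sample value (hypothetical, not real actor output).
sample = "data:image/jpeg;base64," + base64.b64encode(b"\xff\xd8\xff\xe0").decode("ascii")
mime, raw = decode_thumbnail(sample)
```

An empty thumbnail field yields None, so callers can skip items without an image.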

Key features

  • Proxy fallback workflow
    Starts without a proxy and automatically escalates to datacenter and then residential proxies on blocks or failures (with exponential backoff and retries). This boosts reliability for Google News scraping without API access.

  • Region & language controls
    Configure Google country (gl), UI language (hl), language-limited results (lr), and country-limited results (cr) to tailor your Google News data extraction by market.

  • Flexible time filtering
    Filter by last hour, day, week, month, year, or a custom date range using time_period with time_period_min/time_period_max in MM/DD/YYYY format.

  • Clean snippets & readable dates
    HTML is stripped from RSS descriptions to produce clean snippets, while pubDate is converted to both a relative "time ago" string and an ISO 8601 UTC timestamp.

  • Smart thumbnail capture
    Fetches images via Open Graph or Twitter Card tags, with fallbacks to article content images. Valid images are encoded as base64 data URLs for portable use.

  • De-duplication & multi-strategy harvesting
    Prevents duplicates via GUID tracking and augments collection by trying multiple time-range strategies (e.g., day/week/month) when no specific time period is set.

  • Async performance & stability
    Built on Python asyncio + aiohttp for speed, with per-request timeouts, rate limiting between requests, and up to 3 retries per proxy level to maximize success rates.

  • Real-time dataset writes
    Items are saved incrementally during the run, so you can monitor results as they arrive and consume them from the run's dataset stream.
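
The proxy escalation described in the first feature can be pictured roughly as follows. This is a sketch under stated assumptions, not the actor's code: the function name fetch_with_fallback, the fetch callback signature, the delay values, and the reading of "3 retries" as three attempts per level are all hypothetical.

```python
import asyncio

PROXY_LEVELS = ["none", "datacenter", "residential"]  # escalation order from the feature list
MAX_ATTEMPTS = 3  # assumption: "up to 3 retries" read as three attempts per level


async def fetch_with_fallback(fetch, url, base_delay=1.0):
    """Try each proxy level in order, backing off exponentially between attempts."""
    last_error = None
    for level in PROXY_LEVELS:
        for attempt in range(MAX_ATTEMPTS):
            try:
                return await fetch(url, level)
            except Exception as exc:  # a real client would catch narrower error types
                last_error = exc
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all proxy levels failed for {url}") from last_error


# Demo with a fake fetcher that only succeeds through a residential proxy.
calls = []


async def fake_fetch(url, level):
    calls.append(level)
    if level != "residential":
        raise ConnectionError("blocked")
    return "<rss/>"


result = asyncio.run(fetch_with_fallback(fake_fetch, "https://example.com/rss", base_delay=0))
```

In the demo, the fake fetcher fails three times without a proxy and three times on the datacenter level before the first residential attempt succeeds.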

How to use Google News Scraper - step by step

  1. Create or log in to your Apify account
    Access the actor from your Apify dashboard.

  2. Open Google News Scraper
    Navigate to the "google-news-scraper" actor.

  3. Enter your input parameters
    At minimum, provide query and maxItems. Optionally add gl, hl, lr, cr, time_period (and custom dates), nfpr, filter, and proxyConfiguration.

  4. Tune filters and locale

    • Use gl (Google Country) and hl (UI Language) to localize results.
    • Use lr and/or cr to limit results by language or country.
    • Use time_period to constrain recency, including a custom date range.

  5. Control result volume & behavior

    • Set maxItems (100–5000).
    • Toggle nfpr (exclude autocorrect) and filter (Similar/Omitted Results).

  6. Start the run
    The actor fetches Google News RSS data, applies retry logic and proxy fallback as needed, and writes items to the dataset in real time.

  7. Review and download your results
    Open the run's Dataset to view, filter, and export items as needed for your workflow.

Pro tip: For precise date windows, set time_period to custom and provide time_period_min/time_period_max in MM/DD/YYYY.
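
The same run can also be started programmatically. The sketch below builds a request for the Apify API's run-sync-get-dataset-items endpoint using only the standard library. The actor ID is inferred from the store URL with "~" in place of "/", which is an assumption, and the token is a placeholder. Sending the request is left commented out so the example runs offline.

```python
import json
import urllib.request

# Assumption: actor ID derived from the store URL, with "~" replacing "/".
ACTOR_ID = "scraperforge~google-news-scraper"


def build_run_request(token, run_input):
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Custom date window, using the MM/DD/YYYY format from the pro tip.
run_input = {
    "query": "Tesla",
    "maxItems": 100,
    "time_period": "custom",
    "time_period_min": "03/01/2026",
    "time_period_max": "03/31/2026",
}
request = build_run_request("<YOUR_APIFY_TOKEN>", run_input)

# To actually send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```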

Use cases

| Use case name | Description |
| --- | --- |
| Media monitoring & alerts | Track breaking stories and publishers for your topics and brands with a real-time Google News scraper that saves structured articles continuously. |
| SEO & content planning | Identify trending topics and headlines to inform content calendars using consistent Google News headlines scraper output. |
| Competitive intelligence | Monitor competitors' press coverage and announcements by filtering results with country/language parameters. |
| Market & financial tracking | Follow sector-specific news (e.g., "earnings", "acquisition") with time-based filters for last day/week. |
| Academic & policy research | Build structured corpora of articles for analysis using language-restricted results (lr) and region constraints (gl/cr). |
| Data pipelines & dashboards | Use the dataset output as a Google News API alternative to power dashboards and analytics without scraping browsers. |

Why choose Google News Scraper?

This production-ready Google News scraping tool combines precision, automation, and reliability.

  • Accurate structured output with consistent fields (title, link, domain, source, date, snippet, thumbnail, positions).
  • Multilingual and multi-region support via gl, hl, lr, and cr parameters.
  • Scales reliably with async requests, rate limiting, and up to 3 automatic retries per proxy level.
  • Developer-friendly dataset output ready for integrations and downstream processing.
  • Safe-by-design proxy fallback (none → datacenter → residential) to reduce blocks and keep runs stable.
  • Real-time saves to the dataset so long-running queries produce usable data immediately.
  • More robust than browser extensions or ad-hoc scripts: built with aiohttp, BeautifulSoup, and clear retry logic.

Bottom line: if you need a dependable way to scrape Google News without API access, this actor delivers consistent, clean results at scale.

Is it legal / ethical to use Google News Scraper?

Yes, when done responsibly. The actor processes publicly accessible Google News RSS content and does not access private or authenticated data.

Guidelines for compliant use:

  • Respect platform terms and robots.txt directives.
  • Avoid abusive behavior (high request rates, excessive retries).
  • Use data for lawful purposes and follow applicable regulations (e.g., fair use).
  • Attribute original publishers when required by your use case.
  • Consult your legal team for edge cases and jurisdiction-specific requirements.

Input parameters & output format

Example JSON input

{
  "query": "Tesla",
  "maxItems": 200,
  "gl": "United States",
  "hl": "English",
  "lr": "English",
  "cr": "United States",
  "time_period": "last_week",
  "time_period_min": "03/01/2026",
  "time_period_max": "03/31/2026",
  "nfpr": 1,
  "filter": 1,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Parameters

| Field | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| maxItems | integer | Maximum number of search results to retrieve (100–5000 enforced) | 100 | Yes |
| query | string | The search term to use | Elon Musk | Yes |
| gl | string | The Google country to use for the query | (none) | No |
| hl | string | The Google UI language to return results | (none) | No |
| lr | string | Limit the results to a specific language | (none) | No |
| cr | string | Limit the results to a specific country | (none) | No |
| time_period | string | Time period for results: last_hour, last_day, last_week, last_month, last_year, custom | (none) | No |
| time_period_min | string | Minimum date for custom time period (MM/DD/YYYY) | (none) | No |
| time_period_max | string | Maximum date for custom time period (MM/DD/YYYY) | (none) | No |
| nfpr | integer | Exclude results from auto-corrected queries (0 or 1) | 0 | No |
| filter | integer | Enable/disable Similar Results and Omitted Results filters (0 or 1) | 1 | No |
| proxyConfiguration | object | Configure proxy settings. The actor starts with no proxy, then falls back to datacenter, then residential proxies if needed. | {"useApifyProxy": false} | No |

Notes:

  • If maxItems is set below 100, the actor automatically raises it to 100; above 5000, it caps at 5000.
  • For time_period="custom", both time_period_min and time_period_max must be provided in MM/DD/YYYY format.
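
The two rules above can be checked on the client side before starting a run. This is a minimal sketch, assuming a hypothetical helper named normalize_input; it mirrors the documented behavior but is not the actor's own validation code.

```python
from datetime import datetime

MIN_ITEMS, MAX_ITEMS = 100, 5000  # range stated in the parameter notes


def normalize_input(raw):
    """Clamp maxItems to 100-5000 and validate custom date bounds."""
    params = dict(raw)
    requested = int(params.get("maxItems", MIN_ITEMS))
    params["maxItems"] = min(max(requested, MIN_ITEMS), MAX_ITEMS)
    if params.get("time_period") == "custom":
        for key in ("time_period_min", "time_period_max"):
            value = params.get(key)
            if not value:
                raise ValueError(f"{key} is required when time_period is 'custom'")
            # strptime raises ValueError if the date is not in MM/DD/YYYY form.
            datetime.strptime(value, "%m/%d/%Y")
    return params


small = normalize_input({"query": "Tesla", "maxItems": 10})
large = normalize_input({"query": "Tesla", "maxItems": 9000})
```

Here small["maxItems"] is raised to 100 and large["maxItems"] is capped at 5000. A custom period with a missing bound raises ValueError.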

Example JSON output

{
  "position": 1,
  "title": "Tesla announces new factory plans in Mexico",
  "link": "https://example.com/tesla-factory-plans",
  "domain": "example.com",
  "source": "Bloomberg",
  "date": "2 hours ago",
  "date_utc": "2026-03-15T10:30:00+00:00",
  "snippet": "Tesla is planning a new manufacturing facility in Mexico...",
  "thumbnail": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ...",
  "block_position": 1
}

Field notes:

  • thumbnail may be empty if no suitable image is found or the image is not retrievable.
  • date and date_utc are derived from the RSS pubDate; if parsing fails, the actor uses fallbacks.
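
To illustrate how an RSS pubDate can become both fields, here is a sketch using only the standard library. The function names and the wording of the relative string are assumptions; the actor's own rounding and fallback rules are not documented here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_pub_date(pub_date):
    """Parse an RFC 822 style pubDate into an aware UTC datetime."""
    dt = parsedate_to_datetime(pub_date)
    if dt.tzinfo is None:  # dates without a usable offset parse as naive; assume UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def time_ago(published, now):
    """Render the gap between two aware datetimes as a coarse relative string."""
    seconds = max(int((now - published).total_seconds()), 0)
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'s' if count != 1 else ''} ago"
    return "just now"


published = parse_pub_date("Sun, 15 Mar 2026 10:30:00 GMT")
date_utc = published.isoformat()
date = time_ago(published, datetime(2026, 3, 15, 12, 30, tzinfo=timezone.utc))
```

With the sample pubDate and a reference time two hours later, date_utc is "2026-03-15T10:30:00+00:00" and date is "2 hours ago", matching the shape of the example output.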

FAQ

Is there a free trial or free tier?

Yes. This actor includes a 120-minute trial window in its current pricing plan, so you can evaluate it before subscribing.

Does it support Google News scraping with Python?

Yes. The actor is implemented in Python using asyncio and aiohttp, and produces structured dataset items suitable for downstream Python workflows.

How many results can it collect per run?

You can request between 100 and 5000 items via maxItems. The actor enforces this range for stability and performance.

Can I filter by language and country?

Yes. Use hl (UI language), lr (language-limited results), gl (Google country), and cr (country-limited results) to localize your results.

Can I filter by time range?

Yes. Set time_period to last_hour, last_day, last_week, last_month, last_year, or custom. For custom, provide time_period_min and time_period_max in MM/DD/YYYY format.

How does proxy handling work?

The actor starts with no proxy, then automatically falls back to datacenter proxies, and finally residential proxies if blocks or errors occur. It also retries requests up to three times per proxy level with backoff.

Does it de-duplicate results?

Yes. The actor uses item GUIDs from the RSS feed to avoid saving duplicate articles during a run.
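
GUID-based de-duplication can be sketched as below. The sample feed, the unique_items helper, and the choice to skip items that lack a guid are illustrative assumptions rather than the actor's implementation.

```python
import xml.etree.ElementTree as ET

# Hand-written sample feed (hypothetical); the third item repeats the first GUID.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item><guid>a1</guid><title>First story</title></item>
  <item><guid>b2</guid><title>Second story</title></item>
  <item><guid>a1</guid><title>First story</title></item>
</channel></rss>"""


def unique_items(rss_text, seen):
    """Return items whose GUID has not been seen yet, recording new GUIDs in `seen`."""
    items = []
    for item in ET.fromstring(rss_text).iter("item"):
        guid = item.findtext("guid")
        if guid is None or guid in seen:
            continue  # assumption: items without a GUID are skipped in this sketch
        seen.add(guid)
        items.append({"position": len(seen), "title": item.findtext("title")})
    return items


seen_guids = set()
first_pass = unique_items(SAMPLE_RSS, seen_guids)
second_pass = unique_items(SAMPLE_RSS, seen_guids)
```

Sharing one seen set across several feed fetches keeps positions 1-based and prevents the same article from being saved twice when different time-range requests overlap.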

What images are returned?

The actor attempts to fetch an article thumbnail by checking Open Graph and Twitter Card tags and then scanning suitable in-page images. Valid images are returned as base64 data URLs in the thumbnail field.
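
The first part of that lookup, reading Open Graph and Twitter Card tags, might look like the sketch below. The listing says the actor uses BeautifulSoup; this example swaps in the standard library's html.parser so it runs without extra packages, and it omits the in-page image fallback. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaImageParser(HTMLParser):
    """Collect image URLs from og:image and twitter:image <meta> tags."""

    def __init__(self):
        super().__init__()
        self.images = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        if key in ("og:image", "twitter:image") and attr.get("content"):
            self.images.setdefault(key, attr["content"])  # keep the first match per tag type


def pick_thumbnail_url(html):
    """Prefer og:image, then twitter:image; return None when neither is present."""
    parser = MetaImageParser()
    parser.feed(html)
    return parser.images.get("og:image") or parser.images.get("twitter:image")


page = """<html><head>
<meta name="twitter:image" content="https://example.com/card.jpg">
<meta property="og:image" content="https://example.com/og.jpg" />
</head><body></body></html>"""
thumbnail_url = pick_thumbnail_url(page)
```

The returned URL would then be downloaded and base64-encoded to produce the thumbnail value; that step is not shown here.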

Is this a Google News API alternative?

For many use cases, yes. It provides structured article data from Google News RSS that you can use in pipelines and dashboards without relying on a separate API.

What do nfpr and filter options do?

  • nfpr: Excludes results from auto-corrected queries when set to 1.
  • filter: Enables (1) or disables (0) Google's Similar/Omitted Results filters.

Final thoughts

Google News Scraper is built for accurate, scalable collection of structured Google News data. With locale controls, flexible time filters, async performance, and robust proxy fallback, it provides dependable results for marketers, developers, analysts, and researchers. Configure your query, set maxItems and filters, and start capturing real-time news signals with clean titles, links, snippets, timestamps, and thumbnails. If you're building a Google News scraping pipeline or seeking a Google News API alternative, this actor gives you production-ready, structured output to power your apps and analysis.

You might also like

Google News Scraper

scrapepilotapi/google-news-scraper

Google News Scraper extracts real-time headlines, snippets, publishers, timestamps & links from Google News by topic, keyword, region & language. Ideal for media monitoring, PR, SEO, market research & competitive intelligence. Clean JSON/CSV output.

Google News Scraper

scrapapi/google-news-scraper

Google News Scraper collects headlines, snippets, sources, dates & links from Google News by topic, keyword, region & language. Export to JSON/CSV for monitoring trends, competitors & PR. Fast, reliable, proxy-ready. Perfect for media research & market intel.

Google News Scraper

scrapebase/google-news-scraper

Stay on top of breaking stories with this Google News scraper. Extract headlines, sources, publish dates, snippets, links, and more from Google News results. Perfect for trend tracking, media monitoring, research, and content planning. Get fresh news data fast.

Google News Scraper

futurizerush/google-news-scraper

Google News Search Scraper - Real-time news aggregation from Google News. Features smart article enrichment with full content extraction. Perfect for market research, trend analysis, and content monitoring.

Google News Scraper

oneary/google-news-scraper

Extract the latest Google News articles by keyword: get headlines, publishers, snippets, and timestamps in real time. Perfect for media monitoring, brand tracking, and news aggregation. Filter by location and language to surface stories that matter to your audience. No coding, no RSS...

Real-Time Google News Scraper (Keywords + Topics + AI-ready)

ahmed_jasarevic/google-news-actor

Extract structured, real-time news data from Google News using keywords or topic-based scraping.

Google News Scraper

piotrv1001/google-news-scraper

Scrapes news articles from Google News, extracting titles, sources, publication dates, and links. Search by keywords, browse by topic, or get top headlines with multi-language and region support. Ideal for news monitoring, media analysis, and content aggregation.