
URL: https://apify.com/nekohaii/philippine-news-scraper

PH News API - Multi-Source RSS Aggregator · Apify


๐Ÿ‘ PH News API - Multi-Source RSS Aggregator avatar

PH News API - Multi-Source RSS Aggregator

Pricing

from $5.00 / 1,000 articles scraped


Aggregate Philippine news from PhilStar, BusinessWorld, and Rappler via RSS. Full article text, excerpts, categories, author info, and metadata. Supports keyword filtering and per-source limits.


Rating: 0.0 (0)

Developer: Joey Del Rosario

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 6 days ago


PH News API: Full-Text Article Scraper

Get complete Philippine news articles, not just headlines. This scraper fetches RSS feeds from 5+ Philippine publications, then extracts the full body text from every article page, with ads, navigation, and sidebars removed.

Output is clean JSON with title, URL, full body text, publication date, author, categories, and more. Ready for AI pipelines, NLP, databases, or dashboards.

What you get

  • Full article body text: not just RSS summaries. Each article URL is visited and the main content is extracted via trafilatura.
  • 5 publications configured: four currently working (PhilStar with 5 sections, Rappler, SunStar, Daily Tribune), plus BusinessWorld, which is currently blocked and kept for auto-recovery. 5 additional sources are attempted.
  • 100+ articles per run across all sources
  • Keyword filtering: only return articles matching your topic
  • ISO 8601 dates: UTC-normalized, query-ready

Output sample

{
  "title": "Senate approves 2026 national budget on final reading",
  "source": "PhilStar",
  "section": "headlines",
  "url": "https://www.philstar.com/headlines/2026/...",
  "published_date": "2026-06-13T09:04:00+00:00",
  "author": "Kristine Daguno-Bersamina",
  "body": "MANILA, Philippines - The Senate on Friday approved on third and final reading the proposed P6.352-trillion national budget for 2026...",
  "summary": "The Senate has approved the 2026 national budget on final reading...",
  "categories": ["Senate", "national budget", "2026"],
  "scraped_at": "2026-06-13T12:00:00+00:00"
}
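As a rough illustration of consuming a record like the one above, the following Python sketch decodes the JSON and converts the two ISO 8601 timestamps into timezone-aware datetimes. The record literal is a shortened, hypothetical copy of the sample, and `parse_record` is an illustrative helper name, not part of the actor.

```python
import json
from datetime import datetime

# Shortened, hypothetical copy of the sample record shown above.
record_json = """
{
  "title": "Senate approves 2026 national budget on final reading",
  "source": "PhilStar",
  "published_date": "2026-06-13T09:04:00+00:00",
  "scraped_at": "2026-06-13T12:00:00+00:00",
  "categories": ["Senate", "national budget", "2026"]
}
"""


def parse_record(raw: str) -> dict:
    """Decode one record and turn its timestamp strings into aware datetimes."""
    item = json.loads(raw)
    for key in ("published_date", "scraped_at"):
        # fromisoformat accepts the "+00:00" offset form used in the sample.
        item[key] = datetime.fromisoformat(item[key])
    return item


record = parse_record(record_json)
# Time between publication and scraping, as a timedelta.
lag = record["scraped_at"] - record["published_date"]
print(record["source"], record["published_date"].isoformat(), lag)
```

Because both timestamps carry an explicit UTC offset, they can be compared or subtracted directly without guessing a local timezone.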

How it works

  1. Fetch RSS/Atom feeds from each publication
  2. Normalize to common schema (title, date, author, categories, image)
  3. For every article URL, fetch the page and extract clean body text
  4. Sort by date (newest first) and push to dataset

Full-text extraction is powered by trafilatura, a Python library that extracts the main content from news articles while removing boilerplate.
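The steps above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the actor's actual code: the feed is a hypothetical inline RSS 2.0 string instead of a live download, `normalize_feed` and `newest_first` are illustrative names, and only a few schema fields are shown. Step 3 is omitted because it needs network access; with trafilatura it would typically be a call to `trafilatura.fetch_url(url)` followed by `trafilatura.extract(html)`.

```python
import xml.etree.ElementTree as ET
from datetime import timezone
from email.utils import parsedate_to_datetime

# Hypothetical two-item RSS 2.0 feed standing in for a downloaded one (step 1).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example</title>
  <item>
    <title>Older story</title>
    <link>https://example.com/a</link>
    <pubDate>Fri, 12 Jun 2026 08:00:00 +0800</pubDate>
  </item>
  <item>
    <title>Newer story</title>
    <link>https://example.com/b</link>
    <pubDate>Sat, 13 Jun 2026 17:04:00 +0800</pubDate>
  </item>
</channel></rss>"""


def normalize_feed(xml_text: str, source: str) -> list[dict]:
    """Step 2: map RSS items to a common schema with UTC ISO 8601 dates."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter("item"):
        # RSS uses RFC 822 style dates; convert them to aware datetimes.
        published = parsedate_to_datetime(node.findtext("pubDate"))
        items.append({
            "title": (node.findtext("title") or "").strip(),
            "url": (node.findtext("link") or "").strip(),
            "source": source,
            "published_date": published.astimezone(timezone.utc).isoformat(),
        })
    return items


def newest_first(items: list[dict]) -> list[dict]:
    """Step 4: sort by date; UTC ISO strings of equal format sort chronologically."""
    return sorted(items, key=lambda it: it["published_date"], reverse=True)


articles = newest_first(normalize_feed(SAMPLE_FEED, "Example"))
for article in articles:
    print(article["published_date"], article["title"])
```

Normalizing every date to UTC before sorting is what makes a plain string sort safe here; mixed offsets would not order correctly as text.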

Pricing

$5.00 per 1,000 articles ($0.005 each). A typical run of 100 articles costs $0.50. Full-text extraction is included at no extra charge.
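The arithmetic above can be checked with a small estimator. This sketch assumes cost scales linearly with the number of articles and that there is no minimum charge or rounding rule; `estimate_cost` is an illustrative name, and actual billing may differ.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("5.00")  # USD per 1,000 articles, as stated above


def estimate_cost(article_count: int) -> Decimal:
    """Estimate run cost in USD, assuming purely linear per-article pricing."""
    if article_count < 0:
        raise ValueError("article_count must be non-negative")
    return (PRICE_PER_1000 * article_count / 1000).quantize(Decimal("0.001"))


for count in (1, 100, 1000):
    print(count, "articles ->", estimate_cost(count), "USD")
```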

Sources

Publication       Status          Feeds / notes
PhilStar          Working         headlines, nation, opinion, business, world
Rappler           Working         all articles
SunStar           Working         all articles
Daily Tribune     Working         all articles
BusinessWorld     Blocked (403)   kept for auto-recovery
Manila Bulletin   Attempted       RSS feed may be blocked
Manila Times      Attempted       RSS feed may be blocked
Inquirer          Attempted       RSS feed may be blocked
ABS-CBN News      Attempted       RSS feed may be blocked
GMA News          Attempted       RSS feed may be blocked

Input parameters

Parameter         Type      Default   Description
sources           string    all       Comma-separated: all, philstar, rappler, sunstar, tribune, businessworld, manila-bulletin, manila-times, inquirer, abs-cbn, gma-news
keyword           string    empty     Filter articles by keyword in title/summary
extractFullText   boolean   true      Enable/disable full-text body extraction
maxItems          integer   100       Maximum articles total (1-1000)
maxPerSource      integer   20        Maximum articles per individual RSS feed (1-200)
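A run input can be assembled and sanity-checked before starting the actor. The sketch below uses the parameter names, defaults, and ranges listed above; `build_run_input` and `KNOWN_SOURCES` are hypothetical helpers, and the client-side validation is an assumption about what the actor accepts, not a description of its own checks.

```python
# Source identifiers as listed in the "sources" parameter description.
KNOWN_SOURCES = {
    "all", "philstar", "rappler", "sunstar", "tribune", "businessworld",
    "manila-bulletin", "manila-times", "inquirer", "abs-cbn", "gma-news",
}


def build_run_input(
    sources: str = "all",
    keyword: str = "",
    extract_full_text: bool = True,
    max_items: int = 100,
    max_per_source: int = 20,
) -> dict:
    """Build an input dict, rejecting values outside the documented ranges."""
    names = [s.strip() for s in sources.split(",") if s.strip()]
    if not names:
        raise ValueError("at least one source is required")
    unknown = sorted(set(names) - KNOWN_SOURCES)
    if unknown:
        raise ValueError(f"unknown sources: {', '.join(unknown)}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    if not 1 <= max_per_source <= 200:
        raise ValueError("maxPerSource must be between 1 and 200")
    return {
        "sources": ",".join(names),
        "keyword": keyword,
        "extractFullText": extract_full_text,
        "maxItems": max_items,
        "maxPerSource": max_per_source,
    }


run_input = build_run_input(sources="philstar, rappler", keyword="budget", max_items=50)
print(run_input)
```

With the `apify-client` package installed and an API token available, a dict like this would typically be passed as `ApifyClient(token).actor("nekohaii/philippine-news-scraper").call(run_input=run_input)`; the actor ID here is taken from the page URL.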
