URL: https://apify.com/joyouscam35875/rss-news-aggregator

RSS & News Feed Aggregator - Multi-Source Article Scraper · Apify


Aggregate and parse RSS/Atom feeds from any source. Extract articles with titles, descriptions, authors, dates, images. Optionally fetch full article content. Perfect for news monitoring and AI pipelines. $0.0005/article.

Pricing: Pay per usage

Rating: 0.0 (0)

Developer: Ken Digital (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 22
  • Monthly active users: 8
  • Last modified: 2 months ago

Aggregate and parse multiple RSS 2.0, RSS 1.0 (RDF), and Atom feeds into clean, structured data. Built for monitoring, curation, and intelligence workflows.

Features

  • Multi-format support: RSS 2.0, RSS 1.0 (RDF), and Atom feeds
  • Namespace handling: Parses media:, dc:, content:encoded, and standard Atom namespaces
  • Full content extraction: Optionally follows article links and strips HTML to plain text
  • Robust parsing: Handles encoding issues, CDATA blocks, malformed dates, and missing fields
  • Structured output: Consistent schema across all feed types
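
The HTML-to-text step mentioned above can be approximated with the standard library alone. The sketch below is not the actor's implementation; `strip_html` and `_TextExtractor` are hypothetical names, and joining text nodes with a single space is an assumed simplification.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores anything inside skipped elements."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flushes any buffered trailing text
    return parser.text()
```

Because text nodes are joined with a space, inline markup in the middle of a word (for example `wor<b>ld</b>`) will gain a space; that trade-off keeps adjacent paragraphs from running together.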

Output Schema

Each article in the dataset contains:

Field          Type       Description
feedUrl        string     Source feed URL
feedTitle      string     Feed/channel title
title          string     Article title
link           string     Article permalink
description    string     Article summary (HTML stripped, max 5000 chars)
author         string     Author name
publishedDate  string     ISO 8601 publication date
categories     string[]   Tags/categories from the feed
imageUrl       string     Thumbnail or featured image URL
guid           string     Unique identifier (GUID or permalink)
fullContent    string     Full article text (only when fetchFullContent is enabled)
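
For typed downstream code, the schema above could be mirrored as a `TypedDict`. This is a sketch: the field names and types come from the table, while `Article`, `CORE_FIELDS`, and `missing_fields` are hypothetical names. The page does not say whether empty fields are omitted or returned as empty strings, so the checker only reports absent keys.

```python
from typing import List, TypedDict


class Article(TypedDict, total=False):
    """One dataset item, following the documented output schema."""

    feedUrl: str
    feedTitle: str
    title: str
    link: str
    description: str
    author: str
    publishedDate: str
    categories: List[str]
    imageUrl: str
    guid: str
    fullContent: str  # documented as present only with fetchFullContent


# Every documented field except fullContent.
CORE_FIELDS = {
    "feedUrl", "feedTitle", "title", "link", "description",
    "author", "publishedDate", "categories", "imageUrl", "guid",
}


def missing_fields(item: dict) -> List[str]:
    """Return the documented core fields that are absent from an item."""
    return sorted(CORE_FIELDS - set(item))
```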

Input Parameters

{
  "feedUrls": [
    "https://feeds.bbci.co.uk/news/rss.xml",
    "https://rss.nytimes.com/services/xml/rss/index.xml",
    "https://hnrss.org/frontpage"
  ],
  "maxArticles": 100,
  "fetchFullContent": false
}
  • feedUrls (required): Array of RSS/Atom feed URLs to aggregate
  • maxArticles (default: 100): Maximum articles to output. Set to 0 for unlimited.
  • fetchFullContent (default: false): Follow article links and extract full text content
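
When the input is built in code, a small helper can catch obvious mistakes before a run is started. The sketch below uses the three documented parameters and their defaults; `build_run_input` is a hypothetical name, and the validation rules are assumptions, not checks the actor is known to perform.

```python
from urllib.parse import urlparse


def build_run_input(feed_urls, max_articles=100, fetch_full_content=False):
    """Build the actor input dict, rejecting obviously invalid values early."""
    if not feed_urls:
        raise ValueError("feedUrls is required and must not be empty")
    for url in feed_urls:
        parsed = urlparse(url)
        # Assumption: only absolute http(s) URLs make sense as feed sources.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an http(s) URL: {url!r}")
    if max_articles < 0:
        raise ValueError("maxArticles must be 0 (unlimited) or positive")
    return {
        "feedUrls": list(feed_urls),
        "maxArticles": int(max_articles),
        "fetchFullContent": bool(fetch_full_content),
    }
```

For example, `build_run_input(["https://hnrss.org/frontpage"], max_articles=0)` requests every article from that feed without full-content fetching.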

Use Cases

📡 News Monitoring & Media Intelligence

Track coverage across dozens of news outlets. Monitor specific topics by aggregating topic-specific RSS feeds from major publishers. Feed results into sentiment analysis or trend detection pipelines.

📋 Content Curation & Newsletters

Aggregate content from niche blogs, industry publications, and thought leaders into a single dataset. Use as the data source for automated newsletter generation or content recommendation systems.

🔍 Competitive Intelligence

Subscribe to competitor blogs, press release feeds, and industry news. Get structured alerts when new content is published. Combine with keyword filtering for targeted monitoring.
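
Keyword filtering is not one of the documented input parameters, so it would presumably be applied to the dataset items after a run. A minimal sketch, assuming case-insensitive substring matching on the `title` and `description` fields; `matches_keywords` and `filter_items` are hypothetical helpers.

```python
def matches_keywords(item, keywords):
    """True if any keyword occurs in the item's title or description."""
    text = " ".join(
        str(item.get(field) or "") for field in ("title", "description")
    ).lower()
    return any(keyword.lower() in text for keyword in keywords)


def filter_items(items, keywords):
    """Keep only the items that mention at least one keyword."""
    return [item for item in items if matches_keywords(item, keywords)]
```

Substring matching is deliberately crude: "ai" also matches "paid". Word-boundary regular expressions would be a stricter alternative.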

📊 Research & Dataset Building

Build timestamped article datasets for NLP research, media studies, or training data collection. The consistent schema makes downstream processing straightforward.

🤖 AI Pipeline Input

Use as a data source for LLM-powered summarization, classification, or knowledge base updates. The structured output integrates directly with vector databases and RAG pipelines.

โฐ Scheduled Monitoring

Run on a schedule (hourly, daily) with Apify's scheduling feature. Combine with deduplication logic downstream to maintain a continuously updated article database.
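
The downstream deduplication could key on the `guid` field from the output schema. This is a sketch under assumptions: `dedupe` is a hypothetical helper, `link` is used as a fallback key, and the `seen` set would need to be persisted between scheduled runs (for example in a database) by the caller.

```python
def dedupe(items, seen=None):
    """Return items whose guid (or link) has not been seen before.

    `seen` is a set of known identifiers; it is updated in place so the
    same set can be reused across successive runs.
    """
    seen = set() if seen is None else seen
    fresh = []
    for item in items:
        key = item.get("guid") or item.get("link")
        if key is None:
            # No identifier to compare on: keep the item rather than drop it.
            fresh.append(item)
            continue
        if key in seen:
            continue
        seen.add(key)
        fresh.append(item)
    return fresh
```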

Technical Notes

  • Uses Python stdlib xml.etree.ElementTree for XML parsing (no lxml dependency)
  • HTTP requests via httpx with async support and configurable timeouts
  • Handles BOM-prefixed feeds and common encoding edge cases
  • Date parsing supports RFC 822 (RSS) and ISO 8601 (Atom) formats
  • Full content extraction removes <script>, <style>, and <noscript> blocks before stripping HTML
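
To illustrate the approach these notes describe, here is a minimal standard-library sketch that parses RSS 2.0 and Atom with `xml.etree.ElementTree` and normalizes dates to ISO 8601. It is not the actor's code: `parse_date` and `parse_feed` are hypothetical names, RSS 1.0 (RDF) and the media:/dc: namespaces are left out for brevity, and treating timezone-less dates as UTC is an assumption.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def parse_date(raw):
    """Return an ISO 8601 string, or None if the value cannot be parsed."""
    if not raw:
        return None
    raw = raw.strip()
    try:
        dt = parsedate_to_datetime(raw)  # RFC 822 style, used by RSS
    except (TypeError, ValueError):
        try:
            # ISO 8601 style, used by Atom; map a trailing "Z" to an offset.
            dt = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        except ValueError:
            return None
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return dt.isoformat()


def parse_feed(xml_bytes):
    """Parse an RSS 2.0 or Atom document into a list of article dicts."""
    root = ET.fromstring(xml_bytes)
    articles = []
    if root.tag == f"{ATOM_NS}feed":
        for entry in root.findall(f"{ATOM_NS}entry"):
            link = entry.find(f"{ATOM_NS}link")
            raw_date = (entry.findtext(f"{ATOM_NS}published")
                        or entry.findtext(f"{ATOM_NS}updated"))
            articles.append({
                "title": entry.findtext(f"{ATOM_NS}title", default=""),
                "link": link.get("href", "") if link is not None else "",
                "publishedDate": parse_date(raw_date),
            })
    else:
        # RSS 2.0 items are un-namespaced <item> elements under <channel>.
        for item in root.iter("item"):
            articles.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "publishedDate": parse_date(item.findtext("pubDate")),
            })
    return articles
```

Returning `None` for unparseable dates, instead of raising, keeps one malformed entry from aborting the whole feed.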

Pricing

$0.0005 per article parsed and pushed to the dataset.
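
As a rough budgeting aid, the per-article price can be multiplied out before a run. The sketch below covers only the stated $0.0005 rate; `estimate_cost` is a hypothetical helper, and any other platform charges are outside its scope.

```python
from decimal import Decimal

PRICE_PER_ARTICLE = Decimal("0.0005")  # USD, as stated on this page


def estimate_cost(article_count: int) -> Decimal:
    """Estimated charge in USD for a given number of pushed articles."""
    if article_count < 0:
        raise ValueError("article_count must not be negative")
    return PRICE_PER_ARTICLE * article_count
```

At this rate, the default `maxArticles` of 100 comes to $0.05 per run, and 10,000 articles come to $5.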

Example Feeds to Get Started

Feed                 URL
BBC News             https://feeds.bbci.co.uk/news/rss.xml
Hacker News          https://hnrss.org/frontpage
TechCrunch           https://techcrunch.com/feed/
ArXiv CS.AI          http://arxiv.org/rss/cs.AI
Reddit r/technology  https://www.reddit.com/r/technology/.rss

🔗 More Scrapers by Ken Digital

Scraper                      What it does                         Price
YouTube Channel Scraper      Videos, stats, metadata              $0.001/video
France Job Scraper           WTTJ + France Travail + Hellowork    $0.005/job
France Real Estate Scraper   5 sources + DVF price analysis       $0.008/listing
Website Content Crawler      HTML → Markdown for AI/RAG           $0.001/page
Google Trends Scraper        Keywords, regions, related queries   $0.002/keyword
GitHub Repo Scraper          Stars, forks, languages, topics      $0.002/repo
RSS News Aggregator          Multi-source feed parsing            $0.0005/article
Instagram Profile Scraper    Followers, bio, posts                $0.0015/profile
Google Maps Scraper          Businesses, reviews, contacts        $0.002/result
TikTok Scraper               Videos, likes, shares                $0.001/video
Google SERP Scraper          Search results, PAA, snippets        $0.003/search
Trustpilot Scraper           Reviews, ratings, sentiment          $0.001/review

🔗 Quick Integration

Python

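from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
# Replace {...} with the input object described under "Input Parameters".
run = client.actor("joyouscam35875/rss-news-aggregator").call(run_input={...})
# Iterate over the articles stored in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)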

Node.js

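import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
// Replace {...} with the input object described under "Input Parameters".
const run = await client.actor('joyouscam35875/rss-news-aggregator').call({...});
// Fetch the articles stored in the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();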

No-code: Make / Zapier / n8n

Search for this actor in the Apify connector. No code needed.
