Blog Scraper

Pricing

from $33.00 / 1,000 standard-fetches

Company Blog Scraper, Blog Post Scraper, Corporate Blog Crawler, Automatic Blog Discovery, Blog Content Extractor, Article Metadata Scraper, Multi-Domain Blog Scraper, Competitor Blog Analysis, Content Marketing Scraper, Blog Post Metadata Extraction, Company Announcements Scraper.

Developer

Wyald

Maintained by Community

Actor stats

Issues response: 56 days
Last modified: 6 months ago

Targeted Keywords

Primary: Blog Scraper, Content Extraction, Company Blog Crawler, Article Scraper
Secondary: Blog Post Metadata, Content Marketing Analysis, Blog Content Aggregation, Corporate Blog Mining

Features

✅ Automatic Blog Discovery: Intelligently finds blog sections on company websites
✅ Smart Content Extraction: Extracts comprehensive blog post data, including:
- Title
- Author
- Publication date
- Full article content
- Excerpt/summary
- Tags
- Category
- URL
✅ Configurable Limits: Set the maximum number of posts per domain (up to 50)
✅ Multiple Domain Support: Scrape multiple company websites in a single run
✅ Structured Output: Returns clean JSON data with all metadata
✅ Fast & Lightweight: Uses Crawlee with BeautifulSoup for efficient HTTP-based scraping (no headless browser overhead)

Input

Field	Type	Description	Required	Default
`company_urls`	Array	List of company domain URLs or homepage URLs to scrape (e.g., `["https://stripe.com", "shopify.com"]`).	Yes	-
`max_blogposts_to_fetch`	Number	Maximum number of blog posts to fetch per domain (1-50)	No	10
`max_concurrency`	Number	Number of concurrent requests	No	2

Input Example

{
  "company_urls": [
    "https://www.stripe.com",
    "https://shopify.com",
    "https://ai-bees.io"
  ],
  "max_blogposts_to_fetch": 10,
  "max_concurrency": 2
}
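The input above can also be assembled and checked in code before a run is started. The following is a minimal sketch, assuming the field names, defaults, and the 1-50 range from the Input table; `build_run_input` is a hypothetical helper, not part of the actor.

```python
from __future__ import annotations

from typing import Any

# Values taken from the Input table; the helper itself is illustrative only.
MAX_POSTS_LIMIT = 50
DEFAULT_POSTS = 10
DEFAULT_CONCURRENCY = 2


def build_run_input(
    company_urls: list[str],
    max_blogposts_to_fetch: int = DEFAULT_POSTS,
    max_concurrency: int = DEFAULT_CONCURRENCY,
) -> dict[str, Any]:
    """Validate the documented fields locally and return the run input dict."""
    if not company_urls:
        raise ValueError("company_urls is required and must not be empty")
    if not 1 <= max_blogposts_to_fetch <= MAX_POSTS_LIMIT:
        raise ValueError("max_blogposts_to_fetch must be between 1 and 50")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "company_urls": list(company_urls),
        "max_blogposts_to_fetch": max_blogposts_to_fetch,
        "max_concurrency": max_concurrency,
    }
```

The resulting dict has the same shape as the JSON example and could be passed as the run input through the Apify console, API, or a client library.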

Output Example

{
  "url": "https://www.stripe.com/blog/example-post",
  "domain": "www.stripe.com",
  "post_title": "How we scaled our payment infrastructure",
  "author": "Jane Doe",
  "published_date": "2024-01-15",
  "content": "Full article content here...",
  "excerpt": "Learn how we scaled our payment infrastructure to handle millions of transactions...",
  "tags": ["engineering", "infrastructure", "scaling"],
  "category": "Engineering",
  "scraped_at": "2024-01-20T10:30:00.000Z"
}
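Downstream code may want typed records instead of raw dicts. The sketch below assumes every record has the fields shown in the example and that dates use the ISO formats shown there; `BlogPost` and `parse_post` are hypothetical names. Because extraction quality varies by site, unparseable dates are kept as `None` rather than raising.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Any, Optional


@dataclass
class BlogPost:
    url: str
    domain: str
    post_title: str = ""
    author: Optional[str] = None
    published_date: Optional[date] = None
    content: str = ""
    excerpt: str = ""
    tags: list[str] = field(default_factory=list)
    category: Optional[str] = None
    scraped_at: Optional[datetime] = None


def _parse_date(value: Optional[str]) -> Optional[date]:
    """Parse YYYY-MM-DD; return None for missing or unexpected formats."""
    try:
        return date.fromisoformat(value) if value else None
    except ValueError:
        return None


def _parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO timestamp; a trailing 'Z' is rewritten as a UTC offset."""
    try:
        return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None
    except ValueError:
        return None


def parse_post(record: dict[str, Any]) -> BlogPost:
    """Convert one output record (shape assumed from the example) to a BlogPost."""
    return BlogPost(
        url=record["url"],
        domain=record["domain"],
        post_title=record.get("post_title") or "",
        author=record.get("author"),
        published_date=_parse_date(record.get("published_date")),
        content=record.get("content") or "",
        excerpt=record.get("excerpt") or "",
        tags=list(record.get("tags") or []),
        category=record.get("category"),
        scraped_at=_parse_timestamp(record.get("scraped_at")),
    )
```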

How It Works

Domain Analysis: The scraper starts by visiting each provided company domain
Blog Detection: It automatically searches for blog sections using common patterns (/blog, /news, /articles, etc.)
Post Discovery: Once in the blog section, it identifies individual blog post URLs
Content Extraction: For each post, it extracts:
- Structured metadata (title, author, date)
- Full article content
- Additional metadata (tags, categories)
Limit Enforcement: Respects the max_blogposts_to_fetch limit per domain
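The detection and limit steps above can be sketched roughly as follows. This is not the actor's actual code: only /blog, /news, and /articles are named in the description, so the pattern list is limited to those, and the helper names and the "post lives below a blog section" heuristic are assumptions for illustration.

```python
from __future__ import annotations

from urllib.parse import urljoin, urlparse

# Only the patterns named in the description; the "etc." is not expanded here.
BLOG_PATHS = ("/blog", "/news", "/articles")


def candidate_blog_urls(homepage: str) -> list[str]:
    """Build the blog-section URLs to probe for a given homepage."""
    return [urljoin(homepage, path) for path in BLOG_PATHS]


def looks_like_post(url: str) -> bool:
    """Heuristic (assumption): a post sits below a blog section, e.g. /blog/some-slug."""
    path = urlparse(url).path.rstrip("/")
    return any(path.startswith(prefix + "/") for prefix in BLOG_PATHS)


def select_posts(links: list[str], limit: int) -> list[str]:
    """Keep unique post-like links in order, stopping at the per-domain limit."""
    seen: set[str] = set()
    selected: list[str] = []
    for link in links:
        if len(selected) >= limit:
            break
        if link in seen or not looks_like_post(link):
            continue
        seen.add(link)
        selected.append(link)
    return selected
```

In this sketch `limit` plays the role of max_blogposts_to_fetch and is applied once per domain.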

Usage Tips

URL Format: You can provide URLs with or without https:// - the scraper will normalize them
Rate Limiting: The scraper includes automatic delays to be respectful to target websites
Post Limits: Maximum 50 posts per domain to prevent excessive scraping
Concurrency: Adjust max_concurrency based on target website capacity (default: 2)
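The exact normalization rules are not documented. As a rough illustration of what accepting URLs with or without https:// could involve, here is a minimal sketch; `normalize_company_url` is a hypothetical helper, and defaulting to https and lowercasing the host are assumptions.

```python
from urllib.parse import urlparse


def normalize_company_url(raw: str) -> str:
    """Add a missing scheme, lowercase the host, and drop a trailing slash."""
    raw = raw.strip()
    if "://" not in raw:
        # Assumption: bare domains are treated as https.
        raw = "https://" + raw
    parsed = urlparse(raw)
    if not parsed.netloc:
        raise ValueError(f"not a usable URL: {raw!r}")
    return f"{parsed.scheme}://{parsed.netloc.lower()}{parsed.path.rstrip('/')}"
```

With this helper, `shopify.com` and `https://shopify.com/` both resolve to the same string, which also makes it easy to de-duplicate the `company_urls` list before a run.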

Use Cases

Content Marketing Analysis: Analyze competitor blog strategies
Content Aggregation: Collect blog content for research or analysis
Market Intelligence: Monitor company announcements and thought leadership
SEO Research: Study content patterns and topics from successful blogs
Training Data: Collect blog content for ML/AI model training

Notes

The scraper respects robots.txt and includes reasonable delays between requests
Blog structure varies by website - extraction quality depends on site structure
Some blogs may require authentication or have anti-scraping measures
Always ensure you have permission to scrape the target websites
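To check in advance whether a target site permits crawling a given path, Python's standard library can evaluate robots.txt rules. This is a sketch of such a pre-check, not the actor's internal logic; `allowed_by_robots` is a hypothetical helper that takes the robots.txt text as a string so that it works without network access.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines and marks the rules as loaded.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules, made up for illustration.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(allowed_by_robots(EXAMPLE_ROBOTS, "https://example.com/blog/post"))
    print(allowed_by_robots(EXAMPLE_ROBOTS, "https://example.com/private/draft"))
```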

URL: https://apify.com/naive_zing/blog-scraper
