Hacker News Scraper: Stories, Comments & Search by Keyword
Pricing
from $0.01 / 1,000 results
Search and scrape Hacker News stories, comments, and polls by keyword: points, authors, comment counts, dates, and links. Powered by the official HN API.
Scrape Hacker News at scale with a fast, reliable Hacker News scraper built on the official Algolia HN Search API. Search Hacker News by keyword to pull matching stories, comments, and polls, or grab the current front page and newest items, with no API key, no login, and no anti-bot headaches. Every result is emitted as one clean, structured JSON record ready for analysis, dashboards, alerting, or AI pipelines.
Whether you are tracking what Hacker News says about your product, monitoring keywords like "artificial intelligence", or building a dataset of top stories, this Hacker News scraper gets you there in seconds.
Features
- Keyword search across Hacker News stories, comments, and polls.
- Front page mode: scrape the current HN front page without a query.
- Sort by relevance or date (newest first).
- Numeric filters: keep only items at or above a minimum points or comment count.
- Automatic pagination up to the Algolia cap of roughly 1,000 results.
- Flat, structured output: one record per result, ready for CSV/JSON/Excel export.
- No anti-bot issues: uses the public Algolia HN API, so runs are cheap and stable.
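This page does not show how the actor talks to Algolia, so the following is only a sketch of how the sort, filter, and content-type features could map onto the public HN Search API. The `search` and `search_by_date` endpoints and the `tags`, `numericFilters`, `hitsPerPage`, and `page` parameters belong to that API; the helper name `build_search_url` and the page size of 100 are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://hn.algolia.com/api/v1"


def build_search_url(query="", content_type="story", sort_by="relevance",
                     min_points=None, min_comments=None,
                     page=0, hits_per_page=100):
    """Build one HN Search request URL (hypothetical helper, not the actor's code)."""
    # "search" ranks by relevance; "search_by_date" returns newest items first.
    endpoint = "search_by_date" if sort_by == "date" else "search"

    params = {
        "query": query,
        "tags": content_type,  # story, comment, poll, or front_page
        "hitsPerPage": hits_per_page,
        "page": page,
    }

    # Comma-separated numeric filters are combined with AND.
    filters = []
    if min_points is not None:
        filters.append(f"points>={min_points}")
    if min_comments is not None:
        filters.append(f"num_comments>={min_comments}")
    if filters:
        params["numericFilters"] = ",".join(filters)

    return f"{BASE_URL}/{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("rust programming", sort_by="date", min_points=50))
```

Building the URL as a pure function keeps the request logic easy to inspect before any network call is made.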
Quick start
Paste this input to scrape the 10 most relevant stories about artificial intelligence:
{"query":"artificial intelligence","contentType":"story","sortBy":"relevance","maxItems":10}
Scrape the current front page (no keyword needed):
{"query":"","contentType":"front_page","sortBy":"relevance","maxItems":30}
Find the newest highly-upvoted discussions about a topic:
{"query":"rust programming","contentType":"story","sortBy":"date","minPoints":50,"maxItems":100}
Input
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | "artificial intelligence" | Keyword or phrase to search for. Leave empty to fetch the latest items / front page. |
| contentType | select | story | What to scrape: story, comment, poll, or front_page. |
| sortBy | select | relevance | relevance (best match) or date (newest first). |
| maxItems | integer | 50 | Maximum total results (1 to 1000; Algolia caps near 1000). |
| minPoints | integer | (none) | Only keep items with at least this many points. |
| minComments | integer | (none) | Only keep items with at least this many comments. |
| proxyConfiguration | proxy | { "useApifyProxy": true } | Proxy settings. Datacenter proxies work fine here. |
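To sanity-check an input before starting a run, the documented defaults and allowed values can be applied locally. The sketch below is an assumption about reasonable validation, not the actor's own behaviour: the function name `normalize_input` is hypothetical, and whether the actor rejects or silently clamps out-of-range values is not stated here.

```python
ALLOWED_CONTENT_TYPES = {"story", "comment", "poll", "front_page"}
ALLOWED_SORTS = {"relevance", "date"}

# Defaults copied from the input table.
DEFAULTS = {
    "query": "artificial intelligence",
    "contentType": "story",
    "sortBy": "relevance",
    "maxItems": 50,
    "minPoints": None,
    "minComments": None,
}


def normalize_input(raw):
    """Fill in documented defaults and reject values outside documented ranges."""
    cfg = {**DEFAULTS, **raw}

    if cfg["contentType"] not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unknown contentType: {cfg['contentType']!r}")
    if cfg["sortBy"] not in ALLOWED_SORTS:
        raise ValueError(f"unknown sortBy: {cfg['sortBy']!r}")
    if not 1 <= cfg["maxItems"] <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    # Optional numeric filters: None means "no filter".
    for key in ("minPoints", "minComments"):
        if cfg[key] is not None and cfg[key] < 0:
            raise ValueError(f"{key} must not be negative")

    return cfg


if __name__ == "__main__":
    print(normalize_input({"query": "", "contentType": "front_page", "maxItems": 30}))
```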
Output
Each result is pushed as one record to the dataset. Example story record:
{"query":"artificial intelligence","objectID":"39038064","title":"The rise of artificial intelligence agents","url":"https://example.com/ai-agents","author":"pg","points":412,"numComments":187,"createdAt":"2026-01-12T09:33:00.000Z","createdAtTimestamp":1768210380,"hnUrl":"https://news.ycombinator.com/item?id=39038064","storyText":null,"tags":["story","author_pg","story_39038064"]}
Comment records additionally include commentText, storyId, and parentId.
| Field | Description |
|---|---|
| query | The search query used for the run. |
| objectID | Unique Hacker News item ID. |
| title | Story/poll title (null for comments). |
| url | External link (null for Ask/Show HN and text posts). |
| author | Hacker News username of the author. |
| points | Score / upvotes. |
| numComments | Number of comments on the item. |
| createdAt | ISO 8601 creation timestamp. |
| createdAtTimestamp | Unix creation timestamp. |
| hnUrl | Canonical Hacker News discussion URL. |
| storyText | HTML-stripped self/Ask HN text (if any). |
| tags | Algolia _tags array. |
| commentText | (Comments only) HTML-stripped comment body. |
| storyId | (Comments only) ID of the parent story. |
| parentId | (Comments only) ID of the direct parent item. |
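For readers who want to see how a raw Algolia hit could become one of these records, here is a minimal sketch. The input field names (`num_comments`, `created_at`, `created_at_i`, `story_text`, `comment_text`, `story_id`, `parent_id`, `_tags`) are those returned by the public HN Search API; the helpers `strip_html` and `to_record` are hypothetical, and the actor's real HTML stripping may differ from this crude version.

```python
import html
import re

_TAG_RE = re.compile(r"<[^>]+>")


def strip_html(text):
    """Crude HTML-to-text: paragraph tags become blank lines, other tags are dropped."""
    if text is None:
        return None
    text = text.replace("<p>", "\n\n")
    # Remove tags first, then decode entities, so escaped markup survives as text.
    return html.unescape(_TAG_RE.sub("", text)).strip()


def to_record(hit, query):
    """Map one raw API hit to the flat record layout described in the table."""
    object_id = hit["objectID"]
    record = {
        "query": query,
        "objectID": object_id,
        "title": hit.get("title"),
        "url": hit.get("url"),
        "author": hit.get("author"),
        "points": hit.get("points"),
        "numComments": hit.get("num_comments"),
        "createdAt": hit.get("created_at"),
        "createdAtTimestamp": hit.get("created_at_i"),
        "hnUrl": f"https://news.ycombinator.com/item?id={object_id}",
        "storyText": strip_html(hit.get("story_text")),
        "tags": hit.get("_tags", []),
    }
    # Comment hits carry three extra fields.
    if "comment" in record["tags"]:
        record["commentText"] = strip_html(hit.get("comment_text"))
        record["storyId"] = hit.get("story_id")
        record["parentId"] = hit.get("parent_id")
    return record
```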
FAQ
Do I need a Hacker News API key?
No. This Hacker News scraper uses the free, public Algolia HN Search API, so no key or login is needed.
How many results can I get?
The Algolia HN API caps results at roughly 1000 per query. Set maxItems accordingly.
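As a rough illustration of what the cap means for pagination, the sketch below works out which pages a run would need for a given maxItems. The cap of 1000 comes from the answer above; the page size of 100 and the helper name `plan_pages` are assumptions, not details of the actor.

```python
import math

RESULT_CAP = 1000  # approximate per-query cap described above


def plan_pages(max_items, hits_per_page=100):
    """Return (page_index, items_to_keep) pairs covering max_items results."""
    wanted = min(max_items, RESULT_CAP)
    page_count = math.ceil(wanted / hits_per_page)

    plan = []
    for page in range(page_count):
        remaining = wanted - page * hits_per_page
        # The last page may only be partially needed.
        plan.append((page, min(hits_per_page, remaining)))
    return plan


if __name__ == "__main__":
    print(plan_pages(250))
```

Requests beyond the cap are simply truncated in this sketch, so asking for 5000 items plans the same ten pages as asking for 1000.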
Why is url sometimes null?
Ask HN, Show HN, and text posts have no external link, so url is null. Use hnUrl for the discussion page and storyText for the body.
Can I scrape only comments?
Yes. Set contentType to comment. Records will include commentText, storyId, and parentId.
Will I get rate-limited or blocked?
The Algolia HN API is very tolerant and has no anti-bot protection, so datacenter proxies are fine.
Tips
- Use sortBy: "date" with minPoints to build a feed of fresh, already-popular discussions.
- Combine query with contentType: "comment" to mine sentiment and opinions on a topic.
- Leave query empty and set contentType: "front_page" to snapshot the HN front page on a schedule.
- Schedule this actor to run hourly to monitor a keyword and feed results into Slack or a webhook.
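When a keyword is monitored on a schedule, consecutive runs will usually return many of the same items. One way to alert only on new ones is to remember the objectID values already seen, as in the sketch below. The helpers `new_records` and `format_alert` are hypothetical; persisting `seen_ids` between runs and posting the message to Slack or a webhook are left out.

```python
def new_records(records, seen_ids):
    """Return records not seen before and add their IDs to seen_ids."""
    fresh = []
    for record in records:
        object_id = record["objectID"]
        if object_id not in seen_ids:
            seen_ids.add(object_id)
            fresh.append(record)
    return fresh


def format_alert(record):
    """Build a one-line message for a chat channel or webhook payload."""
    # Comment records have no title, so fall back to a generic label.
    title = record.get("title") or "(comment)"
    return f"{title} by {record.get('author')}: {record['hnUrl']}"


if __name__ == "__main__":
    seen = set()
    run = [{"objectID": "1", "title": "Hello", "author": "pg",
            "hnUrl": "https://news.ycombinator.com/item?id=1"}]
    for item in new_records(run, seen):
        print(format_alert(item))
```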
