URL: https://apify.com/fascinating_lentil/hacker-news-intelligence-scraper

Hacker News Scraper for Stories & Comments · Apify


Pricing

from $0.75 / 1,000 Hacker News items scraped

Hacker News Intelligence Scraper

Scrape Hacker News stories, comments, jobs, Ask HN, Show HN, and keyword search results. Export clean JSON or CSV with scores, authors, URLs, dates, filters, and nested discussions. No login or API key required.

Rating

0.0

(0)

Developer

Md Jakaria Mirza

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 0
Last modified: 2 days ago

Hacker News Scraper - Stories, Comments, Scores & Search

Scrape Hacker News stories, comments, scores, and search results into clean, structured data. This Hacker News scraper collects top, new, best, Ask HN, Show HN, and jobs feeds, runs full-text keyword search, fetches specific item IDs, and optionally nests comment threads. Export to JSON, CSV, Excel, or HTML, or pull via the Apify API. No login and no API key required.

Built with Node.js 20, TypeScript, and the Apify SDK. It uses the official Hacker News Firebase API for feeds and item details and the public HN Algolia Search API for keyword search, with retries and bounded concurrency so runs are reliable and repeatable. No browser and no proxy are required.
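The retry and bounded-concurrency behaviour mentioned above could look roughly like the sketch below. The helper names `withRetry` and `mapWithConcurrency`, the attempt count, and the backoff delays are illustrative assumptions; the listing does not show the Actor's actual implementation.

```typescript
// Hypothetical helpers: names, limits, and delays are assumptions for illustration.

/** Retry an async operation with exponential backoff between attempts. */
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // The delay doubles after every failed attempt.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

/** Map over items with at most `limit` tasks in flight; results keep input order. */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let nextIndex = 0;
  // Each worker repeatedly claims the next unprocessed index until none are left.
  const worker = async (): Promise<void> => {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await task(items[index] as T, index);
    }
  };
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Combining the two, for example `mapWithConcurrency(ids, 10, (id) => withRetry(() => fetchItem(id)))` with a hypothetical `fetchItem`, keeps the number of simultaneous requests fixed while still recovering from transient failures.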

What It Extracts

  • id - numeric Hacker News item ID
  • type - item type (story, comment, job, poll)
  • title - story or item title
  • text and textHtml - item body as plain text and HTML
  • url - external link, when present
  • hnUrl - the Hacker News discussion URL
  • domain - hostname of the external link
  • author - HN username
  • score - points
  • commentCount - number of comments
  • parentId - parent item ID for comments
  • pollId and pollParts - poll references
  • createdAt and createdAtUnix - publication time (ISO and Unix)
  • rank - position in the feed
  • feed - source feed (top, new, best, ask, show, jobs)
  • query - the search query that matched the record
  • dead and deleted - moderation flags
  • comments - optional nested comment threads (id, author, text, textHtml, score, createdAt, createdAtUnix, parentId, depth, dead, deleted, hnUrl)
  • collectedAt - scrape timestamp
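To make the field list concrete, here is a minimal sketch of how a raw item from the Hacker News Firebase API might be mapped onto a subset of these fields. The raw field names (`by`, `time`, `descendants`, `parent`) follow the public Firebase API; the `toRecord` and `hostnameOf` helpers and the exact null-handling are assumptions, not the Actor's code.

```typescript
// Raw item fields as exposed by the Hacker News Firebase API (/v0/item/<id>.json).
interface RawHnItem {
  id: number;
  type?: string;
  by?: string;
  time?: number; // Unix time in seconds
  title?: string;
  url?: string;
  score?: number;
  descendants?: number; // total comment count on stories
  parent?: number;
  dead?: boolean;
  deleted?: boolean;
}

// A subset of the output fields listed above.
interface HnRecord {
  id: number;
  type: string | null;
  title: string | null;
  url: string | null;
  hnUrl: string;
  domain: string | null;
  author: string | null;
  score: number | null;
  commentCount: number | null;
  parentId: number | null;
  createdAt: string | null;
  createdAtUnix: number | null;
  dead: boolean;
  deleted: boolean;
}

/** Hostname of an external link, or null when the URL is missing or malformed. */
function hostnameOf(url: string | undefined): string | null {
  if (!url) return null;
  try {
    return new URL(url).hostname;
  } catch {
    return null;
  }
}

/** Hypothetical mapping from a raw API item to the output record shape. */
function toRecord(item: RawHnItem): HnRecord {
  const createdAtUnix = item.time ?? null;
  return {
    id: item.id,
    type: item.type ?? null,
    title: item.title ?? null,
    url: item.url ?? null,
    hnUrl: `https://news.ycombinator.com/item?id=${item.id}`,
    domain: hostnameOf(item.url),
    author: item.by ?? null,
    score: item.score ?? null,
    commentCount: item.descendants ?? null,
    parentId: item.parent ?? null,
    createdAt: createdAtUnix === null ? null : new Date(createdAtUnix * 1000).toISOString(),
    createdAtUnix,
    dead: item.dead ?? false,
    deleted: item.deleted ?? false,
  };
}
```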

Use Cases

  1. Monitor developer, startup, and technology trends across the top, new, and best feeds.
  2. Track product, brand, and competitor mentions with keyword and domain filters.
  3. Discover Show HN launches and emerging tools as they appear.
  4. Analyze Ask HN discussions and developer sentiment with nested comments.
  5. Collect Hacker News jobs feed data for recruiting and hiring research.
  6. Build alerts, dashboards, and datasets for AI and research pipelines.

Pricing

This Actor uses Apify Pay Per Event pricing. You pay only for clean records delivered to the dataset. Empty, filtered, or failed records are not charged.

| Event name | Price per event | 1,000 results | 10,000 results |
| --- | --- | --- | --- |
| item-scraped | $0.00075 | $0.75 | $7.50 |
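As a quick budgeting aid, the arithmetic behind the pricing table can be sketched as follows. The function name `estimateCostUsd` is hypothetical, and the sketch assumes the listed per-event price applies to your plan and that only delivered records are billed, as stated above.

```typescript
// Price per `item-scraped` event from the pricing table, kept in micro-dollars
// so the arithmetic stays in integers until the final division.
const PRICE_PER_ITEM_MICRO_USD = 750; // $0.00075

/** Estimated charge in USD for a number of records delivered to the dataset. */
function estimateCostUsd(recordsDelivered: number): number {
  if (!Number.isInteger(recordsDelivered) || recordsDelivered < 0) {
    throw new RangeError("recordsDelivered must be a non-negative integer");
  }
  return (recordsDelivered * PRICE_PER_ITEM_MICRO_USD) / 1_000_000;
}
```

For example, `estimateCostUsd(1000)` gives 0.75 and `estimateCostUsd(10000)` gives 7.5, matching the table.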

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | yes | feed | feed, search, or items |
| feed | string | yes | top | top, new, best, ask, show, or jobs |
| query | string | yes | artificial intelligence | Keyword query for search mode |
| searchType | string | yes | story | Search stories or comments |
| itemIds | string[] | no | - | HN item IDs for items mode |
| maxResults | integer | yes | 100 | Maximum dataset records (1-1000) |
| minScore | integer | no | 0 | Minimum points |
| minComments | integer | no | 0 | Minimum story comments |
| includeKeywords | string[] | no | [] | Require at least one title/text keyword |
| excludeKeywords | string[] | no | [] | Exclude matching title/text keywords |
| authors | string[] | no | [] | Exact HN usernames |
| domain | string | no | empty | Required substring in external hostname |
| fromDate | string | no | empty | Earliest publication date |
| toDate | string | no | empty | Latest publication date |
| includeComments | boolean | no | false | Nest comments in each record |
| maxCommentsPerItem | integer | no | 50 | Nested comment limit per record (1-500) |
| commentDepth | integer | no | 3 | Reply depth limit (1-10) |
| includeDeadOrDeleted | boolean | no | false | Include dead or deleted items |
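If you build inputs programmatically, a small pre-flight check can catch out-of-range values before a run starts. The sketch below types the fields from the input table and checks only the documented enumerations and numeric ranges; the `validateInput` helper is hypothetical, and the Actor's own validation may differ.

```typescript
// Allowed values and ranges are taken from the input table; the checks are an assumption.
const MODES: readonly string[] = ["feed", "search", "items"];
const FEEDS: readonly string[] = ["top", "new", "best", "ask", "show", "jobs"];

interface ActorInput {
  mode?: string;
  feed?: string;
  query?: string;
  searchType?: string;
  itemIds?: string[];
  maxResults?: number;
  minScore?: number;
  minComments?: number;
  includeKeywords?: string[];
  excludeKeywords?: string[];
  authors?: string[];
  domain?: string;
  fromDate?: string;
  toDate?: string;
  includeComments?: boolean;
  maxCommentsPerItem?: number;
  commentDepth?: number;
  includeDeadOrDeleted?: boolean;
}

/** Record a problem when a provided value is not an integer within [min, max]. */
function checkRange(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
  problems: string[],
): void {
  if (value === undefined) return; // omitted fields fall back to the documented default
  if (!Number.isInteger(value) || value < min || value > max) {
    problems.push(`${name} must be an integer between ${min} and ${max}`);
  }
}

/** Returns a list of human-readable problems; an empty list means the input looks valid. */
function validateInput(input: ActorInput): string[] {
  const problems: string[] = [];
  if (input.mode !== undefined && !MODES.includes(input.mode)) {
    problems.push(`mode must be one of: ${MODES.join(", ")}`);
  }
  if (input.feed !== undefined && !FEEDS.includes(input.feed)) {
    problems.push(`feed must be one of: ${FEEDS.join(", ")}`);
  }
  checkRange("maxResults", input.maxResults, 1, 1000, problems);
  checkRange("maxCommentsPerItem", input.maxCommentsPerItem, 1, 500, problems);
  checkRange("commentDepth", input.commentDepth, 1, 10, problems);
  return problems;
}
```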

Example input

{
  "mode": "feed",
  "feed": "show",
  "maxResults": 25,
  "minScore": 10,
  "includeKeywords": ["AI", "developer"],
  "includeComments": true,
  "maxCommentsPerItem": 20,
  "commentDepth": 2
}

How to Scrape Hacker News (Step by Step)

  1. Click Try for free / Run.
  2. Pick a mode: a feed (top, new, best, ask, show, jobs), keyword search, or specific itemIds.
  3. Add filters such as minScore, includeKeywords, authors, domain, or a date range to narrow results.
  4. Set maxResults (start small to test) and toggle includeComments if you want nested threads.
  5. Run the Actor, then export results as JSON, CSV, Excel, or HTML, or pull them via the Apify API.
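For the API route in the last step, one option is Apify's synchronous `run-sync-get-dataset-items` endpoint, which starts a run and returns the dataset items in one call. The sketch below is an illustration under assumptions: the Actor ID is inferred from the store URL (`fascinating_lentil/hacker-news-intelligence-scraper`, joined with `~`), and `buildRunRequest` and `runActor` are hypothetical helper names. It requires an Apify API token for your account.

```typescript
// Assumption: the API Actor ID is the store path with "~" between username and Actor name.
const ACTOR_ID = "fascinating_lentil~hacker-news-intelligence-scraper";

interface RunRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

/** Build the HTTP request for a synchronous run that returns dataset items. */
function buildRunRequest(token: string, input: Record<string, unknown>): RunRequest {
  return {
    url: `https://api.apify.com/v2/acts/${ACTOR_ID}/run-sync-get-dataset-items`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(input),
    },
  };
}

/** Run the Actor and return its dataset items (performs a network call). */
async function runActor(token: string, input: Record<string, unknown>): Promise<unknown[]> {
  const { url, init } = buildRunRequest(token, input);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Apify API responded with HTTP ${response.status}`);
  }
  return (await response.json()) as unknown[];
}
```

The synchronous endpoint waits for the run to finish, so for large or slow runs the asynchronous run endpoints or the official `apify-client` package may be a better fit.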

Sample Output

{
  "id": 48487029,
  "type": "story",
  "title": "Show HN: An open-source AI agent for developers",
  "text": null,
  "textHtml": null,
  "url": "https://github.com/example/ai-agent",
  "hnUrl": "https://news.ycombinator.com/item?id=48487029",
  "domain": "github.com",
  "author": "devbuilder",
  "score": 287,
  "commentCount": 94,
  "parentId": null,
  "pollId": null,
  "pollParts": [],
  "createdAt": "2026-06-11T05:22:06.000Z",
  "createdAtUnix": 1781155326,
  "rank": 1,
  "feed": "show",
  "query": null,
  "dead": false,
  "deleted": false,
  "comments": [
    {
      "id": 48487102,
      "author": "curious_hacker",
      "text": "This is great, how does it handle rate limits?",
      "textHtml": "<p>This is great, how does it handle rate limits?</p>",
      "score": 0,
      "createdAt": "2026-06-11T05:41:12.000Z",
      "createdAtUnix": 1781156472,
      "parentId": 48487029,
      "depth": 1,
      "dead": false,
      "deleted": false,
      "hnUrl": "https://news.ycombinator.com/item?id=48487102"
    }
  ],
  "collectedAt": "2026-06-11T06:00:00.000Z"
}
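The sample shows each comment carrying its own `parentId` and `depth`. Assuming the `comments` array is delivered as a flat list in that shape (the sample has only one comment, so this is a guess), a reply tree can be rebuilt on the consumer side with a sketch like the following. `buildThread` and the `replies` property are hypothetical names.

```typescript
// Subset of the comment fields shown in the sample output.
interface HnComment {
  id: number;
  parentId: number;
  depth: number;
  author: string | null;
  text: string | null;
}

interface ThreadNode extends HnComment {
  replies: ThreadNode[];
}

/** Rebuild a reply tree from a flat comment list using parentId links. */
function buildThread(comments: readonly HnComment[]): ThreadNode[] {
  // First pass: create a node per comment so parents can be found regardless of order.
  const nodes = new Map<number, ThreadNode>();
  for (const comment of comments) {
    nodes.set(comment.id, { ...comment, replies: [] });
  }
  // Second pass: attach each node to its parent, or treat it as top-level.
  const topLevel: ThreadNode[] = [];
  for (const comment of comments) {
    const node = nodes.get(comment.id) as ThreadNode;
    const parent = nodes.get(comment.parentId);
    if (parent) {
      parent.replies.push(node);
    } else {
      // The parent is the story itself, or a comment that was not collected.
      topLevel.push(node);
    }
  }
  return topLevel;
}
```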

How It Works

  1. Validates the input and selects the collection mode (feed, search, or items).
  2. Fetches feed and item details from the official Hacker News Firebase API, and keyword results from the public HN Algolia Search API.
  3. Applies score, comment, keyword, author, domain, and date filters.
  4. Optionally fetches and nests comment threads up to your depth and count limits.
  5. Charges item-scraped only after a clean record is saved, then writes it to the Apify Dataset.
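The filtering step can be pictured as a single predicate applied to each candidate record. The sketch below follows the filter descriptions in the input table; the `matchesFilters` name, case-insensitive keyword matching, and the direct timestamp comparison for `fromDate` and `toDate` are assumptions, since the listing does not specify these details.

```typescript
interface FilterableRecord {
  title: string | null;
  text: string | null;
  author: string | null;
  domain: string | null;
  score: number | null;
  commentCount: number | null;
  createdAtUnix: number; // seconds
}

interface Filters {
  minScore?: number;
  minComments?: number;
  includeKeywords?: string[];
  excludeKeywords?: string[];
  authors?: string[];
  domain?: string;
  fromDate?: string; // anything Date.parse accepts, e.g. an ISO 8601 string
  toDate?: string;
}

/** True when the record passes every filter that is set. */
function matchesFilters(record: FilterableRecord, filters: Filters): boolean {
  const haystack = `${record.title ?? ""} ${record.text ?? ""}`.toLowerCase();
  // Assumption: keywords match as case-insensitive substrings of title and text.
  const mentions = (keyword: string): boolean => haystack.includes(keyword.toLowerCase());
  const createdAtMs = record.createdAtUnix * 1000;

  if ((record.score ?? 0) < (filters.minScore ?? 0)) return false;
  if ((record.commentCount ?? 0) < (filters.minComments ?? 0)) return false;
  if (filters.includeKeywords?.length && !filters.includeKeywords.some(mentions)) return false;
  if (filters.excludeKeywords?.some(mentions)) return false;
  if (filters.authors?.length && !filters.authors.includes(record.author ?? "")) return false;
  if (filters.domain && !(record.domain ?? "").includes(filters.domain)) return false;
  if (filters.fromDate && createdAtMs < Date.parse(filters.fromDate)) return false;
  if (filters.toDate && createdAtMs > Date.parse(filters.toDate)) return false;
  return true;
}
```

Filtering before records are written fits the pricing note that filtered records are not charged.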

Known Limits

  • text and textHtml are only present for items that have body content; link stories return null for these fields.
  • comments are only populated when includeComments is enabled, and are bounded by maxCommentsPerItem and commentDepth.
  • Keyword search uses the HN Algolia API, so results and ranking follow that service's coverage and indexing.
  • Dead or deleted items are excluded unless includeDeadOrDeleted is enabled.
  • maxResults is capped at 1,000 records per run.

Data Sources

This Actor uses the official Hacker News Firebase API and the public HN Algolia Search API. It does not rely on fragile page selectors.
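For orientation, the two public APIs named above expose simple URL patterns. The sketch below builds those URLs; the base URLs, feed endpoint names, and the `tags` and `hitsPerPage` parameters come from the public APIs, while the mapping from this Actor's feed names, the helper names, and the choice of the relevance-sorted `search` endpoint are assumptions about how the Actor might use them.

```typescript
const FIREBASE_BASE = "https://hacker-news.firebaseio.com/v0";
const ALGOLIA_BASE = "https://hn.algolia.com/api/v1";

// Assumed mapping from the Actor's feed names to Firebase API list endpoints.
const FEED_ENDPOINTS = {
  top: "topstories",
  new: "newstories",
  best: "beststories",
  ask: "askstories",
  show: "showstories",
  jobs: "jobstories",
} as const;

type Feed = keyof typeof FEED_ENDPOINTS;

/** URL returning the array of item IDs for a feed. */
function feedUrl(feed: Feed): string {
  return `${FIREBASE_BASE}/${FEED_ENDPOINTS[feed]}.json`;
}

/** URL returning the JSON details for a single item. */
function itemUrl(id: number): string {
  return `${FIREBASE_BASE}/item/${id}.json`;
}

/** URL for a relevance-sorted keyword search restricted to stories or comments. */
function searchUrl(query: string, searchType: "story" | "comment", hitsPerPage = 100): string {
  return (
    `${ALGOLIA_BASE}/search?query=${encodeURIComponent(query)}` +
    `&tags=${searchType}&hitsPerPage=${hitsPerPage}`
  );
}
```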

License

Apache-2.0.
