URL: https://apify.com/crawlerbros/hackernews-scraper

Hacker News Scraper · Apify


Pricing

from $1.00 / 1,000 results

Scrape Hacker News stories, comments, jobs, and user profiles. Modes: top/new/best/ask/show/jobs/past/item/user/search. Filters: minScore, domainFilter, dateRange, commentMinScore. No proxy, no auth.

Rating: 0.0 (0)

Developer: Crawler Bros (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 6
  • Monthly active users: 4
  • Last modified: 2 months ago

Scrape Hacker News stories, comments, jobs, and user profiles via the official Firebase + Algolia public APIs. No login, no cookies, no proxy. Modes for top / new / best / Ask HN / Show HN / Jobs / Past, plus direct item lookup, user profile fetch, and Algolia full-text search.

What this actor does

  • Walks the public Hacker News feeds (top, new, best, ask, show, jobs)
  • Fetches individual stories or comment threads by ID
  • Pulls user profiles (karma, account age, submitted item count)
  • Runs full-text search across HN history via the Algolia HN API
  • Builds nested comment trees on demand (enableCommentHierarchy=true) or emits flat comment rows otherwise

Modes

| Mode | What it does |
| --- | --- |
| topStories | Front-page top stories |
| newStories | Newest submissions |
| bestStories | Best-of recent submissions |
| askStories | Ask HN posts |
| showStories | Show HN posts |
| jobStories | YC company job ads |
| item | Direct lookup of specific item IDs |
| user | Direct lookup of user profiles |
| search | Algolia full-text search across all HN history |
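
As a rough illustration of how the feed, item, and user modes could map onto the public Hacker News Firebase API, here is a minimal sketch. The endpoint paths are those of the public API; the `FEED_ENDPOINTS` mapping and the helper names are assumptions for illustration, not the actor's actual code.

```python
FIREBASE_BASE = "https://hacker-news.firebaseio.com/v0"

# Assumed mapping from this actor's mode names to Firebase feed names.
FEED_ENDPOINTS = {
    "topStories": "topstories",
    "newStories": "newstories",
    "bestStories": "beststories",
    "askStories": "askstories",
    "showStories": "showstories",
    "jobStories": "jobstories",
}


def feed_url(mode: str) -> str:
    """Return the URL that lists item IDs for a feed mode."""
    if mode not in FEED_ENDPOINTS:
        raise ValueError(f"{mode!r} is not a feed mode")
    return f"{FIREBASE_BASE}/{FEED_ENDPOINTS[mode]}.json"


def item_url(item_id: int) -> str:
    """Return the URL for a single story, comment, or job item."""
    return f"{FIREBASE_BASE}/item/{item_id}.json"


def user_url(username: str) -> str:
    """Return the URL for a user profile."""
    return f"{FIREBASE_BASE}/user/{username}.json"
```

Each feed URL returns a JSON array of item IDs, so a feed run would resolve every ID through `item_url` before emitting a row.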

Output per story

  • id, hnUrl, type, title
  • url, domain (parsed host)
  • score, numComments, author
  • createdAt (ISO-8601 UTC), createdAtEpoch, ageHours
  • text (Ask/Show/Job posts only; HN-flavored markdown converted to plain text)
  • kids (list of top-level comment IDs)
  • comments[] (nested replies; only when enableCommentHierarchy=true)
  • dead / deleted (boolean; present only when true)
  • recordType: "story", scrapedAt
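
To make the derived fields concrete, the sketch below shows one way a raw Firebase item could be turned into a story row. The input keys (`by`, `time`, `descendants`, `kids`) are those of the public Firebase item format; the function name `to_story_row`, the rounding of `ageHours`, and the exact timestamp format are assumptions.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def to_story_row(item: dict, now: datetime) -> dict:
    """Map a raw Firebase item to a story row (hypothetical helper)."""
    created = datetime.fromtimestamp(item["time"], tz=timezone.utc)
    row = {
        "id": item["id"],
        "hnUrl": f"https://news.ycombinator.com/item?id={item['id']}",
        "type": item.get("type"),
        "title": item.get("title"),
        "score": item.get("score"),
        "numComments": item.get("descendants"),
        "author": item.get("by"),
        "createdAt": created.isoformat().replace("+00:00", "Z"),
        "createdAtEpoch": item["time"],
        # Age is measured against the caller-supplied clock.
        "ageHours": round((now - created).total_seconds() / 3600, 2),
        "kids": item.get("kids", []),
        "recordType": "story",
    }
    # Self-text posts have no url, so url and domain are only set when present.
    if item.get("url"):
        row["url"] = item["url"]
        row["domain"] = urlparse(item["url"]).hostname
    return row
```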

Output per comment

  • id, hnUrl, storyId, parentId, depth
  • author, text (HTML converted to plain text)
  • createdAt, createdAtEpoch, ageHours
  • kids (list of reply IDs), replies[] (nested; only when enableCommentHierarchy=true)
  • recordType: "comment", scrapedAt
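
If a run emits flat comment rows, the `parentId` field is enough to rebuild the tree afterwards. A minimal sketch, assuming each row carries `id` and `parentId` as listed above; `nest_comments` is a hypothetical helper, and dropping orphaned rows is a design choice of this sketch rather than documented actor behavior.

```python
def nest_comments(rows: list[dict], story_id: int) -> list[dict]:
    """Turn flat comment rows into a nested tree using parentId links."""
    # Copy each row so the caller's data is not mutated.
    by_id = {row["id"]: {**row, "replies": []} for row in rows}
    roots = []
    for node in by_id.values():
        parent = by_id.get(node["parentId"])
        if parent is not None:
            parent["replies"].append(node)
        elif node["parentId"] == story_id:
            roots.append(node)
        # Rows whose parent is missing (e.g. cut off by a cap) are dropped.
    return roots
```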

Output per user

  • id, profileUrl
  • karma, createdAt, createdAtEpoch, about
  • submittedCount, submitted (capped at 50 most-recent)
  • recordType: "user", scrapedAt

Empty fields are omitted from the output (no nulls).
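
The omit-empty rule can be pictured as a small post-processing step. The sketch below assumes "empty" means `None`, an empty string, or an empty list or dict; the page does not spell out the exact definition, so treat `omit_empty` as illustrative only.

```python
def omit_empty(record: dict) -> dict:
    """Drop keys whose value is None, an empty string, or an empty container."""
    cleaned = {}
    for key, value in record.items():
        if value is None:
            continue
        if isinstance(value, (str, list, dict)) and len(value) == 0:
            continue
        # Zero and False are real values, so they are kept.
        cleaned[key] = value
    return cleaned
```

Consumers should therefore read optional fields with a default (for example `row.get("url")`) rather than indexing them directly.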

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| mode | enum | topStories | topStories / newStories / bestStories / askStories / showStories / jobStories / item / user / search |
| itemIds | array | [] | When mode=item: numeric HN item IDs |
| usernames | array | [] | When mode=user: HN usernames |
| searchQuery | string | – | When mode=search: full-text query |
| startUrls | array | [] | HN URLs (item?id=N or user?id=USER), auto-routed to itemIds / usernames |
| enableCommentHierarchy | boolean | false | When true, attach a nested comments[] array to each story instead of emitting flat sibling rows |
| maxItems | int | 100 | Hard cap on emitted records (1–5000) |
| maxComments | int | 0 | Cap comments fetched per story. 0 = skip comments. |
| maxDepth | int | 5 | Maximum reply depth when maxComments > 0. |
| minScore | int | – | Drop stories below this score |
| domainAllowlist | array | [] | Only emit stories whose URL host contains one of these substrings |
| domainBlocklist | array | [] | Drop stories whose URL host contains one of these substrings |
| excludeDeadOrDeleted | bool | true | Drop items flagged dead/deleted by HN |
| dateRangeFrom | string | – | Drop items posted before this ISO-date (UTC) |
| dateRangeTo | string | – | Drop items posted after this ISO-date (UTC) |
| commentMinScore | int | – | Drop comments below this score |
| commentAuthorFilter | array | [] | Only emit comments by these usernames |
| minCommentCount | int | – | Drop stories with fewer than this many comments |
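
The auto-routing of startUrls can be sketched as a small URL parser. This assumes the two URL shapes named in the table (item?id=N and user?id=USER); `route_start_urls` is a hypothetical name, and silently skipping unrecognized URLs is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def route_start_urls(urls: list[str]) -> tuple[list[int], list[str]]:
    """Split HN URLs into item IDs and usernames based on path and id param."""
    item_ids, usernames = [], []
    for url in urls:
        parsed = urlparse(url)
        ident = parse_qs(parsed.query).get("id", [None])[0]
        if not ident:
            continue  # no id parameter, nothing to route
        page = parsed.path.strip("/")
        if page == "item" and ident.isdigit():
            item_ids.append(int(ident))
        elif page == "user":
            usernames.append(ident)
    return item_ids, usernames
```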

Example: top stories with filters

{
  "mode": "topStories",
  "maxItems": 50,
  "minScore": 100,
  "domainBlocklist": ["twitter.com", "x.com"],
  "excludeDeadOrDeleted": true
}
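
To show how these filters might combine, here is a minimal predicate over story rows. It follows the field descriptions in the input table (substring matching on the URL host, score threshold, dead/deleted flag); `keep_story` and the order of the checks are assumptions, not the actor's implementation.

```python
from urllib.parse import urlparse


def keep_story(story: dict, cfg: dict) -> bool:
    """Return True if a story row passes the configured filters."""
    host = urlparse(story.get("url", "")).hostname or ""
    min_score = cfg.get("minScore")
    if min_score is not None and story.get("score", 0) < min_score:
        return False
    allow = cfg.get("domainAllowlist") or []
    if allow and not any(part in host for part in allow):
        return False
    block = cfg.get("domainBlocklist") or []
    if any(part in host for part in block):
        return False
    if cfg.get("excludeDeadOrDeleted", True):
        if story.get("dead") or story.get("deleted"):
            return False
    return True
```

Because matching is by substring, a blocklist entry such as "x.com" would also drop hosts like "netflix.com" under this reading; a more specific entry avoids that if it matters.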

Example: full-text search

{
  "mode": "search",
  "searchQuery": "rust async runtime",
  "maxItems": 100,
  "minScore": 50
}
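
For orientation, a search like this could translate into a request against the public Algolia HN Search API roughly as follows. The endpoint and the `query`, `tags`, `hitsPerPage`, and `numericFilters` parameters belong to that public API; whether the actor pushes minScore down as a numeric filter or filters after fetching is not stated, so that part is an assumption.

```python
from urllib.parse import urlencode

ALGOLIA_SEARCH = "https://hn.algolia.com/api/v1/search"


def algolia_search_url(query: str, min_score=None, hits_per_page: int = 100) -> str:
    """Build a story search URL for the Algolia HN Search API."""
    params = {"query": query, "tags": "story", "hitsPerPage": hits_per_page}
    if min_score is not None:
        # Assumed: the score threshold is applied server-side.
        params["numericFilters"] = f"points>={min_score}"
    return f"{ALGOLIA_SEARCH}?{urlencode(params)}"
```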

Example: story with comment tree

{
  "mode": "item",
  "itemIds": ["12345678"],
  "maxComments": 200,
  "maxDepth": 3,
  "enableCommentHierarchy": true
}
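
An input like this can also be submitted programmatically. The sketch below uses the `apify-client` Python package and the actor ID taken from this page's URL; it assumes a client version in which `call()` returns the run as a dict, and `build_input` / `run_actor` are hypothetical helper names.

```python
def build_input(item_id: int, max_comments: int = 200, max_depth: int = 3) -> dict:
    """Build the item-mode input shown above for a single story."""
    return {
        "mode": "item",
        "itemIds": [str(item_id)],
        "maxComments": max_comments,
        "maxDepth": max_depth,
        "enableCommentHierarchy": True,
    }


def run_actor(token: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return its dataset rows."""
    # Imported lazily so build_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("crawlerbros/hackernews-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_actor(token, build_input(12345678))` would need a valid Apify API token and is billed per result as described under Pricing.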

Use cases

  • Trend monitoring: track which domains hit the front page each week
  • Comment intelligence: pull every comment for an Ask HN thread to study reactions
  • YC job-ads digest: weekly extract of jobStories for the careers newsletter
  • User research: fetch a user's submission history + karma stats for outreach
  • Search-driven enrichment: feed Algolia search results into a downstream tagger

FAQ

Does it require a login or cookies? No. Both Firebase and Algolia HN APIs are fully public.

Is a proxy needed? No. The actor works from datacenter IPs without any proxy.

Why are some stories missing a url? Ask HN / Show HN posts can be self-text: they have a text field instead of a url. Because empty fields are omitted from the output, the url field is dropped on these.

Why does commentMinScore not filter much? Hacker News rarely exposes per-comment scores. When the field is missing the comment is kept.

What's the difference between maxItems and maxComments? maxItems caps the total emitted records (stories + comments + users combined). maxComments caps how many comments the actor fetches for each story; how deep it descends into reply chains is controlled separately by maxDepth.
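
The interplay of the per-story caps can be sketched as a bounded walk over a comment tree. This is an illustration only: the breadth-first order, counting top-level comments as depth 1, and the `collect_comments` name are all assumptions, and `items` stands in for already-fetched Firebase items keyed by ID.

```python
from collections import deque


def collect_comments(story: dict, items: dict, max_comments: int, max_depth: int) -> list[dict]:
    """Breadth-first walk of a comment tree, bounded by count and depth."""
    out = []
    if max_comments <= 0:
        return out  # maxComments = 0 means comments are skipped entirely
    queue = deque((kid, 1) for kid in story.get("kids", []))
    while queue and len(out) < max_comments:
        comment_id, depth = queue.popleft()
        comment = items.get(comment_id)
        if comment is None:
            continue  # missing or unfetched item
        out.append({"id": comment_id, "depth": depth})
        if depth < max_depth:
            queue.extend((kid, depth + 1) for kid in comment.get("kids", []))
    return out
```

The global maxItems cap would then apply on top of this, across all stories, comments, and users emitted in a run.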

How fresh is the data? Real-time. Both APIs serve the live HN database.
