Junior Guru Job Scraper Demo

Pricing

Pay per usage

Demo Actor scraper for junior.guru talk.

Developer

Kateřina Hroníková

Maintained by Community

Last modified: a month ago

StartupJobs.cz demo scraper

An Apify Actor that collects developer job listings from StartupJobs.cz using the site's public API.

Built as a live demo for the junior.guru community talk "Web scraping: Nechte internet pracovat za vás" ("Web scraping: Let the internet work for you").

What does it do?

You give it a keyword (e.g. junior, python, javascript) and it returns a list of matching developer/engineer job offers including title, company, location, salary, and a direct link. Non-tech roles (sales, marketing, etc.) are filtered out automatically.

Results are stored in an Apify Dataset and can be exported to CSV, JSON, or other formats in one click.

Prerequisites

Apify account (free tier is enough)
Node.js 18+
Apify CLI

npm install -g apify-cli
apify login

Step 1 — Find the API using DevTools

Before writing any code, open startupjobs.cz/nabidky in your browser and explore how it loads data.

Press F12 to open DevTools
Go to the Network tab
Filter by Fetch/XHR
Reload the page or type a keyword in the search box
Look for a request to /api/offers

You'll see something like:

GET https://www.startupjobs.cz/api/offers?keyword=junior&limit=20&page=1

Open it in a new tab — you get clean JSON back. No HTML parsing needed. 🎉

{
    "resultSet": [
        {
            "name": "Junior TypeScript Developer",
            "company": "Acme s.r.o.",
            "url": "/nabidka/12345/junior-typescript-developer",
            "locations": "Praha",
            "isRemote": true,
            "seniorities": ["junior"],
            "areaSlugs": ["back-end-vyvojar", "vyvoj"],
            "salary": { "min": 40000, "max": 60000, "currency": "CZK", "measure": "monthly" }
        }
    ]
}
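The sample response can be described with a TypeScript type, and the request URL can be built with the keyword escaped so that spaces or `&` do not break the query. This is a minimal sketch inferred only from the sample above: which fields are optional is an assumption, and `buildOffersUrl` and `parseOffers` are hypothetical helpers, not part of the Actor's source.

```typescript
// Shape inferred from the sample response; the real API may return more
// fields or omit some of these (assumption).
interface Salary {
    min: number;
    max: number;
    currency: string;
    measure: string;
}

interface Offer {
    name: string;
    company: string;
    url: string;
    locations: string;
    isRemote: boolean;
    seniorities: string[];
    areaSlugs: string[];
    salary?: Salary; // assumed optional: not every listing has to state a salary
}

interface OffersResponse {
    resultSet: Offer[];
}

const API_URL = 'https://www.startupjobs.cz/api/offers';

// Hypothetical helper: builds the request URL seen in DevTools, escaping each value.
function buildOffersUrl(keyword: string, page = 1, limit = 20): string {
    const params: Array<[string, string]> = [
        ['keyword', keyword],
        ['limit', String(limit)],
        ['page', String(page)],
    ];
    const query = params
        .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
        .join('&');
    return `${API_URL}?${query}`;
}

// Hypothetical helper: parses a response body and returns the offers,
// or an empty array when `resultSet` is missing.
function parseOffers(body: string): Offer[] {
    const data = JSON.parse(body) as Partial<OffersResponse>;
    return Array.isArray(data.resultSet) ? data.resultSet : [];
}
```

Calling `buildOffersUrl('junior')` reproduces the URL shown in the Network tab, and `parseOffers` can be fed the text of the response.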

Step 2 — Walk through the code

The entire actor is in src/main.ts. Here's what it does:

import { Actor } from 'apify';

// API_URL, BASE_URL and DEV_AREA_SLUGS are constants defined elsewhere in src/main.ts.
await Actor.init();
const { keyword = '', seniority = '', maxResults = 50 } = (await Actor.getInput()) ?? {};

let collected = 0;
let page = 1;
while (collected < maxResults) {
    // 1. Call the StartupJobs API: plain fetch(), JSON response
    const response = await fetch(`${API_URL}?keyword=${encodeURIComponent(keyword)}&page=${page}`);
    const { resultSet: offers } = await response.json();
    if (!offers?.length) break; // stop when the API returns no more results

    for (const offer of offers) {
        // 2. Skip non-developer roles (sales, marketing, etc.) and wrong seniority
        const isDevRole = offer.areaSlugs.some((slug) => DEV_AREA_SLUGS.has(slug));
        const isSeniorityMatch = !seniority || offer.seniorities.includes(seniority);
        if (!isDevRole || !isSeniorityMatch) continue;

        // 3. Pick the fields we care about and save to Apify Dataset
        await Actor.pushData({
            title: offer.name,
            company: offer.company,
            url: `${BASE_URL}${offer.url}`,
            // ...
        });
        collected++;
        if (collected >= maxResults) break;
    }
    page++;
}

await Actor.exit();

Three concepts, that's it: fetch → filter → save.
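The filter and save steps can be pulled out into small pure functions, which makes them easy to try without calling the API. This is a sketch under stated assumptions: the contents of `DEV_AREA_SLUGS` and the value of `BASE_URL` are guesses based on the sample response (the real values live in src/main.ts), and `isWanted` and `toRecord` are hypothetical names.

```typescript
interface Offer {
    name: string;
    company: string;
    url: string;
    seniorities: string[];
    areaSlugs: string[];
}

interface JobRecord {
    title: string;
    company: string;
    url: string;
}

// Assumed values, taken from the sample response; the Actor's real set may differ.
const DEV_AREA_SLUGS = new Set<string>(['vyvoj', 'back-end-vyvojar']);
// Assumed from the API host shown in DevTools.
const BASE_URL = 'https://www.startupjobs.cz';

// Filter step: keep developer roles, and match seniority only when one was requested.
function isWanted(offer: Offer, seniority: string): boolean {
    const isDevRole = offer.areaSlugs.some((slug) => DEV_AREA_SLUGS.has(slug));
    const isSeniorityMatch = !seniority || offer.seniorities.includes(seniority);
    return isDevRole && isSeniorityMatch;
}

// Save step input: pick the fields to store and turn the relative link into an absolute one.
function toRecord(offer: Offer): JobRecord {
    return {
        title: offer.name,
        company: offer.company,
        url: `${BASE_URL}${offer.url}`,
    };
}
```

An empty `seniority` string disables the seniority check, which mirrors the `!seniority ||` condition in the walkthrough.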

StartupJobs has a clean API, so we get JSON directly. If it didn't, we'd have to fetch the HTML page and extract data from it using CSS selectors — this is called parsing:

// Without an API you'd do something like this instead:
import * as cheerio from 'cheerio';

const response = await fetch('https://www.startupjobs.cz/nabidky?q=javascript');
const html = await response.text(); // raw HTML string, not JSON
const $ = cheerio.load(html); // parse the HTML
$('.offer-title').each((_, el) => { // find all elements matching a CSS selector
    const title = $(el).text().trim(); // extract the text content
    const url = $(el).attr('href'); // or an attribute
    console.log(title, url);
});

HTML structure changes whenever the site redesigns — APIs are much more stable.

Step 3 — Run locally

# Install dependencies
npm install
# Run without building (great for development)
npm run dev
# Or build first, then run
npm run build
npm start

To set a custom keyword, create storage/key_value_stores/default/INPUT.json:

{
    "keyword": "javascript",
    "seniority": "junior",
    "maxResults": 20
}
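Any field left out of INPUT.json falls back to the defaults from the destructuring in the walkthrough (`''`, `''`, `50`). A minimal sketch of that behaviour, where `ActorInput` and `normalizeInput` are hypothetical names used only for illustration:

```typescript
interface ActorInput {
    keyword: string;
    seniority: string;
    maxResults: number;
}

// Hypothetical helper: applies the same defaults as the destructuring in the
// walkthrough. A missing input (null) behaves like an empty object.
function normalizeInput(raw: Partial<ActorInput> | null): ActorInput {
    const source: Partial<ActorInput> = raw ?? {};
    const { keyword = '', seniority = '', maxResults = 50 } = source;
    return { keyword, seniority, maxResults };
}
```

With the INPUT.json above, the result is the file's content unchanged; with no input file at all, the run uses an empty keyword, no seniority filter, and a limit of 50.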

Step 4 — Deploy to Apify

apify push

Your actor is now live at console.apify.com under My Actors.

Step 5 — Schedule & export

Run on a schedule — e.g. every morning at 8:00:

Open your actor in Apify Console
Go to Schedules → + New Schedule
Set cron: 0 8 * * 1-5 (Mon–Fri at 8:00)
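The five cron fields are minute, hour, day of month, month, and day of week, so `0 8 * * 1-5` reads as "minute 0, hour 8, any day of the month, any month, Monday to Friday". The sketch below spells that rule out as a check on a date. It is only an illustration: `matchesWeekdayMorning` is a hypothetical function, and it uses UTC for simplicity, whereas a real schedule fires according to the time zone configured for it.

```typescript
// Illustration of the cron expression "0 8 * * 1-5", evaluated in UTC (assumption).
function matchesWeekdayMorning(date: Date): boolean {
    const minute = date.getUTCMinutes();
    const hour = date.getUTCHours();
    const dayOfWeek = date.getUTCDay(); // 0 = Sunday, 1 = Monday, ..., 6 = Saturday
    return minute === 0 && hour === 8 && dayOfWeek >= 1 && dayOfWeek <= 5;
}
```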

Export results:

Dataset → Export → CSV / JSON
Or connect directly to Gmail via Apify integrations

Build your own scraper

Want to scrape a different site? You can use this repo as a starting point.

Pick your starting point based on what the target site looks like:

Situation	Template
Site has a JSON API (like this demo)	Clone this repo
No API, static HTML	`ts-crawlee-cheerio`
No API, heavy JavaScript / dynamic content	`ts-crawlee-playwright`
```
apify create my-scraper --template ts-crawlee-cheerio
```
Find the data source — open the target site in your browser, go to DevTools → Network → Fetch/XHR, and look for an API call returning JSON. If there's no API, switch to the Elements tab and find the CSS selectors for the data you need.
Edit src/main.ts — replace the fetch() URL and the fields inside Actor.pushData({...}) with whatever your target API or page returns. The structure stays the same: fetch → filter → save.
Update .actor/input_schema.json to define the inputs your scraper needs (keywords, URLs, limits, etc.).
Run locally with npm run dev, then deploy with apify push.
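The fetch → filter → save structure can be written once as a generic loop and reused for a different site by swapping the four callbacks. This is a sketch, not the Actor's actual code: `collect` and its parameter names are hypothetical, and it assumes the source signals the end of the data by returning an empty page.

```typescript
// Generic fetch → filter → save loop over a paginated source.
// Returns the number of records saved.
async function collect<T, R>(
    fetchPage: (page: number) => Promise<T[]>, // fetch: one page of raw items
    keep: (item: T) => boolean,                // filter: which items to keep
    toRecord: (item: T) => R,                  // pick the fields to store
    save: (record: R) => Promise<void>,        // save: e.g. a wrapper around Actor.pushData
    maxResults: number,
): Promise<number> {
    let collected = 0;
    for (let page = 1; collected < maxResults; page++) {
        const items = await fetchPage(page);
        if (items.length === 0) break; // assumed end-of-data signal

        for (const item of items) {
            if (collected >= maxResults) break;
            if (!keep(item)) continue;
            await save(toRecord(item));
            collected++;
        }
    }
    return collected;
}
```

In a real Actor, `fetchPage` would wrap the `fetch()` call to the target API and `save` would call `Actor.pushData`. Passing them in as functions also means the loop can be tried locally with fake data before touching the network.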

The Apify documentation and Academy are great next steps from here.

Going further

What	How
Compare day-over-day	Store results with a timestamp, diff on next run
Scrape a JS-heavy site	Switch to `PlaywrightCrawler` from Crawlee
Browse 29 000+ ready-made scrapers	apify.com/store
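The day-over-day comparison from the table can be sketched as a diff keyed on the listing URL. This assumes each run's results are available as an array (for example, exported from the Dataset) and that the URL uniquely identifies a listing. `JobRecord`, `RunDiff`, and `diffRuns` are hypothetical names.

```typescript
interface JobRecord {
    url: string;
    title: string;
}

interface RunDiff {
    added: JobRecord[];   // present today, absent yesterday
    removed: JobRecord[]; // present yesterday, absent today
}

// Compares two runs by URL (assumed to be a stable, unique key per listing).
function diffRuns(previous: JobRecord[], current: JobRecord[]): RunDiff {
    const previousUrls = new Set(previous.map((job) => job.url));
    const currentUrls = new Set(current.map((job) => job.url));
    return {
        added: current.filter((job) => !previousUrls.has(job.url)),
        removed: previous.filter((job) => !currentUrls.has(job.url)),
    };
}
```

Storing a timestamp with each record, as the table suggests, makes it possible to pick which two runs to compare.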

Glossary

Web scraping — Automatically collecting data from websites by sending requests and extracting the relevant parts from the response (HTML or JSON).

Server — A computer (or program) that listens for requests over the internet and sends back a response. When you open a website, your browser sends a request to a server, which replies with the page content.

API (Application Programming Interface) — A formal agreement between two programs on how to exchange data: what you can ask for, how to ask it, and what format the answer comes back in. This scraper uses StartupJobs' public API, which means we get clean JSON instead of having to dig through HTML.

Parsing — Analyzing and processing structured text (HTML or JSON) to pull out specific pieces of data. When a site has no API, you parse the raw HTML to find what you need.

JS site (JavaScript-rendered site) — A site that builds its content in the browser using JavaScript. A plain HTTP request returns only an empty shell — the actual data isn't in the source HTML at all. You need a headless browser to load these properly.

Headless browser — A web browser that runs without a visible window. It works exactly like a normal browser (loads pages, runs JavaScript, processes CSS), but everything happens in memory in the background. Used to scrape JS-rendered sites.

LLM (Large Language Model) — A type of AI trained on massive amounts of text, capable of understanding and generating human-like language. In scraping, LLMs can help extract or structure data from unstructured text that would be hard to parse with code alone.

Proxy — An intermediary server between you and the target website. Your requests go through it, so the website sees the proxy's IP address instead of yours. Used to avoid IP bans when scraping at scale.

Resources

Apify SDK for JavaScript/TypeScript
Apify Academy — Web scraping for beginners
junior.guru — community and handbook for junior developers in CZ/SK
Talk slides

URL: https://apify.com/katerinahronik/junior-guru-job-scraper-demo