URL: https://apify.com/memo23/cfy-scraper

⇱ Community First Yorkshire Jobs Scraper Β· Apify


Pricing

from $1.99 / 1,000 results

Scrape jobs and other portfolio content from communityfirstyorkshire.org.uk via WP-JSON portfolio CPT. Filter by taxonomy (default jobs ≈ 6 vacancies). Title, full HTML, location, apply email/URL, best-effort closing date + salary regex. JSON or CSV out.

Rating

0.0

(0)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 17 days ago

Scrape jobs (and other portfolio content) from communityfirstyorkshire.org.uk. Uses the public WP-JSON portfolio custom post type filtered by the portfolio_entries taxonomy (default jobs term = ~6 live vacancies). Each row carries title, full description HTML, location term, apply email/URL (extracted from body), and best-effort closing date + salary. JSON or CSV out, no compute charge per run, just per result.

✨ Why use this scraper?

Community First Yorkshire (CFY) is the rural voluntary-sector hub for North Yorkshire, York, and the Yorkshire Dales. This scraper suits tracking who is hiring at rural Yorkshire charities, running cross-region CVS comparisons, and sourcing paid roles outside the metro areas.

  • 🎯 Three starting points. The default Jobs taxonomy filter (set entityTerms: ["jobs"]), a direct /portfolio-item/<slug>/ URL, or any /wp-json/wp/v2/portfolio URL.
  • ⚡ WP-JSON portfolio CPT as the data source. Each item is a full WordPress portfolio entry with content, taxonomy, and _embed-able media.
  • 🏷️ portfolio_entries taxonomy split. Term names are auto-split into categories (Jobs, Leadership, Networks, etc.) vs locations (North Yorkshire, York, Pateley Bridge, Homeworking).
  • 📧 Apply email/URL from body. Regex-extracted from the content HTML (first mailto: → applyEmail; first outbound http href → externalApplyUrl).
  • 📅 Closing date + salary (best-effort). Heuristic regex against body plain-text ("Closing date: …", "£X – £Y per annum"). Always falls back gracefully to null.
  • 🌐 Beyond jobs. Filter by volunteering, get-support, leadership, networks, advertise, podcast, membership to pull other portfolio content.
  • 📤 Clean exports. One row per item with full HTML description inline. JSON + CSV exported automatically.
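
The apply email/URL bullet above can be sketched as follows. This is a minimal illustration, not the Actor's actual code: the regular expressions, the function name extract_apply_contacts, and the reading of "outbound" as "any host outside the CFY domain" are all assumptions.

```python
import re
from urllib.parse import urlparse

SITE_HOST = "communityfirstyorkshire.org.uk"

# Hypothetical patterns; the Actor's real expressions are not published.
MAILTO_RE = re.compile(r'href=["\']mailto:([^"\'?]+)', re.IGNORECASE)
HREF_RE = re.compile(r'href=["\'](https?://[^"\']+)', re.IGNORECASE)


def extract_apply_contacts(content_html: str) -> dict:
    """Return the first mailto address and the first off-site link, or None for each."""
    mail = MAILTO_RE.search(content_html)
    apply_email = mail.group(1).strip() if mail else None

    external_url = None
    for url in HREF_RE.findall(content_html):
        host = (urlparse(url).hostname or "").lower()
        # Assumption: "outbound" means any host outside the CFY domain.
        if host != SITE_HOST and not host.endswith("." + SITE_HOST):
            external_url = url
            break

    return {"applyEmail": apply_email, "externalApplyUrl": external_url}
```

Both fields stay None when the body has no matching link, which mirrors the "falls back gracefully to null" behaviour described for the best-effort fields.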

🎯 Use cases

TeamWhat they build
Rural CVS recruitersDaily new-vacancy feeds for North Yorkshire / York charities
Sector publicationsAuto-populate Yorkshire voluntary-sector jobs sections
Workforce strategyRural vs urban pay benchmarks across Yorkshire
AggregatorsApply emails / URLs for redirect-and-track use cases
Podcast / content discoveryPull podcast term for the CFY podcast catalogue

📥 Supported inputs

| URL pattern | Behaviour |
| --- | --- |
| (empty + entityTerms: ["jobs"]) | Default: Jobs only (~6 vacancies) |
| https://www.communityfirstyorkshire.org.uk/portfolio-item/<slug>/ | Single portfolio item |
| https://www.communityfirstyorkshire.org.uk/wp-json/wp/v2/portfolio | All portfolio entries (43 items) |
| https://www.communityfirstyorkshire.org.uk/wp-json/wp/v2/portfolio?portfolio_entries=87 | Filter by term ID (pass-through) |

Not supported: browser listing pages (CFY has no public /jobs/ page; content is rendered into a masonry grid on the homepage) and hosts outside communityfirstyorkshire.org.uk.
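
A rough sketch of how the supported URL patterns might be told apart is shown below. The function name classify_start_url and the returned dictionary shape are hypothetical; only the URL patterns themselves come from the table above.

```python
import re
from urllib.parse import urlparse, parse_qs

HOST_SUFFIX = "communityfirstyorkshire.org.uk"
API_PATH = "/wp-json/wp/v2/portfolio"
ITEM_RE = re.compile(r"^/portfolio-item/([^/]+)/?$")


def classify_start_url(url: str) -> dict:
    """Classify a start URL as a single item or a WP-JSON collection URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host != HOST_SUFFIX and not host.endswith("." + HOST_SUFFIX):
        raise ValueError(f"unsupported host: {host!r}")

    item = ITEM_RE.match(parts.path)
    if item:
        return {"kind": "item", "slug": item.group(1)}

    if parts.path.rstrip("/") == API_PATH:
        # Pass an existing portfolio_entries term ID through untouched.
        term_ids = parse_qs(parts.query).get("portfolio_entries", [])
        return {"kind": "api", "termId": term_ids[0] if term_ids else None}

    raise ValueError(f"unsupported path: {parts.path!r}")
```

A single item could then be looked up through the collection endpoint with WordPress's standard slug query parameter, although whether the Actor does it that way is not stated.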

🔄 How it works

  1. Resolve start URLs: either from explicit startUrls, or built from entityTerms (slug → numeric term ID via a known map).
  2. Classify + translate each URL into the canonical /wp-json/wp/v2/portfolio shape, optionally with ?portfolio_entries=<id>&_embed=1.
  3. Walk pagination via X-WP-TotalPages from the response header.
  4. Parse each portfolio item:
    • title, content HTML
    • portfolio_entries term names → split into categories vs locations
    • body regex → apply email, external URL, closing date, salary (best-effort)
  5. Push one normalised row per item to the dataset.
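
Steps 2 and 3 above can be sketched as a small pagination loop. This is an illustration under assumptions: the fetch callable, the helper names, and per_page=100 (the WordPress REST API maximum) are not taken from the Actor. The fetch function is injected so the loop can be tried without network access.

```python
from typing import Callable, Iterator, Optional, Tuple
from urllib.parse import urlencode

API_URL = "https://www.communityfirstyorkshire.org.uk/wp-json/wp/v2/portfolio"

# fetch(url) must return (list_of_items, response_headers_dict).
Fetch = Callable[[str], Tuple[list, dict]]


def build_page_url(page: int, term_id: Optional[int] = None, embed: bool = True) -> str:
    """Build the canonical collection URL for one page."""
    params = {"per_page": 100, "page": page}
    if term_id is not None:
        params["portfolio_entries"] = term_id
    if embed:
        params["_embed"] = 1
    return f"{API_URL}?{urlencode(params)}"


def iter_portfolio_items(fetch: Fetch, term_id: Optional[int] = None,
                         max_items: int = 1000) -> Iterator[dict]:
    """Yield items page by page until X-WP-TotalPages or max_items is reached."""
    if max_items <= 0:
        return
    page, total_pages, pushed = 1, 1, 0
    while page <= total_pages:
        items, headers = fetch(build_page_url(page, term_id))
        # A plain dict is case-sensitive; real HTTP clients usually are not.
        total_pages = int(headers.get("X-WP-TotalPages", 1))
        for item in items:
            yield item
            pushed += 1
            if pushed >= max_items:
                return
        page += 1
```

In practice fetch would wrap an HTTP client and return the decoded JSON body together with the response headers.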

⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | [] | Direct portfolio-item / WP-JSON URLs. Empty = use entityTerms. |
| entityTerms | array | ["jobs"] | portfolio_entries taxonomy slugs to scrape. Allowed: jobs, volunteering, get-support, get-involved, leadership, networks, advertise, podcast, membership. |
| enrichTaxonomies | boolean | true | When true, embeds taxonomy term names + featured image via WP-JSON _embed. |
| postedWithinHours | integer | (none) | Only return rows posted in the last N hours (24 = last day, 72 = last 3 days). Empty/0 = all. Ideal for daily monitoring runs that only want fresh postings. |
| maxItems | integer | 1000 | Hard cap on rows pushed. |
| maxConcurrency / minConcurrency | integer | 5 / 1 | Parallel WP-JSON page-fetch limits. |
| maxRequestRetries | integer | 5 | Retries before a failed request is abandoned. |
| proxy | object | No proxy | The site has no anti-bot protection, so no proxy is needed. |
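
The postedWithinHours filter could work roughly like this. It is a sketch only: is_fresh is a hypothetical helper, and comparing against postedDate in UTC is an assumption based on the "Z" suffix in the output sample.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(posted_date: str, posted_within_hours: Optional[int],
             now: Optional[datetime] = None) -> bool:
    """Return True if the row should be kept under postedWithinHours."""
    if not posted_within_hours:  # None or 0 means "no filtering"
        return True
    now = now or datetime.now(timezone.utc)
    # Normalise the trailing "Z" so datetime.fromisoformat accepts it.
    posted = datetime.fromisoformat(posted_date.replace("Z", "+00:00"))
    return now - posted <= timedelta(hours=posted_within_hours)
```

Passing now explicitly keeps the check deterministic, which is convenient when replaying an earlier run.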

📊 Output overview

Each scraped item is one single dataset row. The type field is "job" when the item is in the "Jobs" category, else "post". The cpt field is always "portfolio".
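
As an illustration of the type rule and the categories vs locations split, here is a minimal sketch. The KNOWN_CATEGORIES set holds only the category names this page mentions and is certainly incomplete; treating every other term as a location is an assumption, as is the function name classify_terms.

```python
# Hypothetical, incomplete lookup: only the category names named on this page.
KNOWN_CATEGORIES = {"Jobs", "Leadership", "Networks"}


def classify_terms(term_names: list) -> dict:
    """Split portfolio_entries term names and derive type, location and remote."""
    categories = [t for t in term_names if t in KNOWN_CATEGORIES]
    # Assumption: any term that is not a known category is a location.
    locations = [t for t in term_names if t not in KNOWN_CATEGORIES]
    return {
        "type": "job" if "Jobs" in categories else "post",
        "cpt": "portfolio",
        "categories": categories,
        "locations": locations,
        "location": locations[0] if locations else None,
        "remote": "Homeworking" in term_names,
        "portfolioTerms": list(term_names),
    }
```

For the terms in the output sample, ["Jobs", "North Yorkshire"], this yields type "job", location "North Yorkshire", and remote false.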

📦 Output sample

{
  "type": "job",
  "cpt": "portfolio",
  "source": "communityfirstyorkshire.org.uk",
  "jobId": "24490",
  "slug": "north-yorkshire-adviser-to-unpaid-carers-veterans-carers-plus-yorkshire",
  "jobUrl": "https://www.communityfirstyorkshire.org.uk/portfolio-item/north-yorkshire-adviser-to-unpaid-carers-veterans-carers-plus-yorkshire/",
  "wpJsonUrl": "https://www.communityfirstyorkshire.org.uk/wp-json/wp/v2/portfolio/24490",
  "title": "North Yorkshire: Adviser to Unpaid Carers (Veterans), Carers Plus Yorkshire",
  "description": "<div>About the role…</div>",
  "descriptionText": "About the role…",
  "companyName": null,
  "companyWebsite": "https://www.carersplus.net/",
  "companyDomain": "carersplus.net",
  "location": "North Yorkshire",
  "locations": ["North Yorkshire"],
  "remote": false,
  "salary": {
    "currency": "GBP",
    "min": 24000,
    "max": 27000,
    "raw": "£24,000 - £27,000 per annum"
  },
  "salaryRaw": "£24,000 - £27,000 per annum",
  "categories": ["Jobs"],
  "employmentTypes": [],
  "contractType": null,
  "portfolioTerms": ["Jobs", "North Yorkshire"],
  "status": "publish",
  "postedDate": "2026-05-15T09:26:35Z",
  "closingDate": "Friday 30 May 2026",
  "modifiedDate": "2026-05-15T09:26:35Z",
  "applyType": "email",
  "applyUrl": "https://www.communityfirstyorkshire.org.uk/portfolio-item/north-yorkshire-adviser-to-unpaid-carers-veterans-carers-plus-yorkshire/",
  "applyEmail": "recruitment@carersplus.net",
  "externalApplyUrl": "https://www.carersplus.net/",
  "featuredImageUrl": null,
  "authorId": 1,
  "authorName": null,
  "scrapedAt": "2026-05-20T00:13:00.000Z"
}

🗂 Key output fields

| Group | Fields |
| --- | --- |
| Identifiers | type (job or post), cpt (always portfolio), source, jobId, slug, jobUrl, wpJsonUrl, scrapedAt |
| Content | title, description (HTML), descriptionText (plain) |
| Dates | postedDate (ISO), closingDate (raw text), modifiedDate (ISO) |
| Employer | companyName (null), companyWebsite (= externalApplyUrl), companyDomain |
| Location | location (primary, from portfolio_entries), locations[] (all), remote (true if 'Homeworking' tag present) |
| Compensation | salary.{currency, min, max, raw} (best-effort regex), salaryRaw |
| Taxonomies | categories[] (Jobs/Leadership/etc.), portfolioTerms[] (all term names) |
| Apply flow | applyType, applyUrl, applyEmail, externalApplyUrl |

❓ FAQ

Why is closing date sometimes null even when the body mentions a deadline? The regex looks for "Closing date:", "Deadline:", or "Apply by:" prefixes. If the body uses other phrasing (e.g. "Applications must arrive by…"), the field stays null. The full body HTML is always in description.
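
A sketch of the prefix-based matching described in this answer is given below. The exact expression the Actor uses is not published, so CLOSING_RE and extract_closing_date are assumptions that only reproduce the three documented prefixes.

```python
import re
from typing import Optional

# Hypothetical pattern covering the three documented prefixes.
CLOSING_RE = re.compile(
    r"\b(?:closing date|deadline|apply by)\s*:\s*"
    r"([^\n;]+?)(?=\.(?:\s|$)|;|\n|$)",
    re.IGNORECASE,
)


def extract_closing_date(body_text: str) -> Optional[str]:
    """Return the raw text after a known prefix, or None if no prefix matches."""
    match = CLOSING_RE.search(body_text)
    return match.group(1).strip() if match else None
```

Text such as "Applications must arrive by 30 May" has none of the prefixes, so the function returns None, which is the behaviour described in the answer.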

Why is salary parsing fragile? CFY items don't have a structured salary field, so the regex hunts for "£" patterns in the body text. Check salaryRaw to see what was matched; if the structured min/max look wrong, fall back to the raw string.
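
To show why this kind of parsing is fragile, here is a minimal sketch of a "£" range matcher. The pattern and the parse_salary name are assumptions, not the Actor's implementation, and setting max equal to min for a single figure is an illustrative choice.

```python
import re
from typing import Optional

# Hypothetical pattern: "£24,000", optionally followed by "- £27,000" and "per annum".
AMOUNT = r"£\s?(\d{1,3}(?:,\d{3})+|\d+)"
SALARY_RE = re.compile(
    AMOUNT + r"(?:\s*(?:-|–|to)\s*" + AMOUNT + r")?(?:\s+per annum)?",
    re.IGNORECASE,
)


def parse_salary(body_text: str) -> Optional[dict]:
    """Return the first salary-like match as a structured dict, or None."""
    match = SALARY_RE.search(body_text)
    if not match:
        return None
    low = int(match.group(1).replace(",", ""))
    # Assumption: a single figure is reported as min == max.
    high = int(match.group(2).replace(",", "")) if match.group(2) else low
    return {"currency": "GBP", "min": low, "max": high, "raw": match.group(0).strip()}
```

The first "£" in the body wins, so a grant amount or an expenses figure mentioned before the salary would be picked up instead. That is the reason the raw string is worth checking.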

Can I scrape volunteering or events too? Yes. Set entityTerms: ["volunteering"] (or other term slugs). The same row shape applies; type becomes "post" for non-job categories.
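
A run input for a non-job term could be assembled like this. build_run_input is a hypothetical helper; the parameter names and allowed slugs come from the input parameters table, while the validation itself is an illustration.

```python
from typing import Optional, Sequence

# Allowed slugs as listed for the entityTerms parameter.
ALLOWED_TERMS = {
    "jobs", "volunteering", "get-support", "get-involved", "leadership",
    "networks", "advertise", "podcast", "membership",
}


def build_run_input(entity_terms: Sequence[str] = ("jobs",), max_items: int = 1000,
                    posted_within_hours: Optional[int] = None) -> dict:
    """Build an Actor input dict, rejecting slugs outside the documented list."""
    unknown = sorted(set(entity_terms) - ALLOWED_TERMS)
    if unknown:
        raise ValueError(f"unknown entityTerms: {unknown}")
    run_input = {
        "startUrls": [],  # empty means "use entityTerms"
        "entityTerms": list(entity_terms),
        "enrichTaxonomies": True,
        "maxItems": max_items,
    }
    if posted_within_hours:
        run_input["postedWithinHours"] = posted_within_hours
    return run_input
```

The resulting dictionary can be pasted into the Actor's JSON input editor or passed as run_input when starting memo23/cfy-scraper through the Apify API client.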

Can I scrape private pages or applicant data? No. Only the public WP-JSON REST API.

How do I limit results? Set maxItems. With only ~6 jobs live, maxItems: 100 covers everything.

💬 Support

🛠 Additional services

  • Custom output shape, additional fields, or one-off datasets: muhamed.didovic@gmail.com
  • Similar scrapers for other CVS / volunteer hubs (Doing Good Leeds, VA Rotherham, VAS Sheffield, Barnsley CVS, BCVS, York CVS): drop an email.
  • For API access (no Apify fee, just usage): muhamed.didovic@gmail.com

🔎 Explore more scrapers

See other scrapers at memo23's Apify profile, covering job boards, real estate, social media, and more.


⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Community First Yorkshire (CFY), communityfirstyorkshire.org.uk, or any of their subsidiaries or affiliates. All trademarks mentioned are the property of their respective owners.

The scraper accesses only the publicly available WP-JSON REST endpoint and public detail pages on communityfirstyorkshire.org.uk: no authenticated endpoints, recruiter-only features, or content behind a login. Users are responsible for ensuring their use complies with communityfirstyorkshire.org.uk's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organisation.

