URL: https://apify.com/memo23/bcvs-scraper

Bassetlaw CVS Jobs Scraper · Apify


Pricing

from $1.99 / 1,000 results

Scrapes bcvs.org.uk (the Bassetlaw + Bolsover jobs site, built on Drupal 10). The HTML scrape captures title, body, closing date, contact email, and the job-spec PDF attachment URL for downstream OCR. Roughly 3-10 live vacancies. No anti-bot protection. JSON or CSV out, billed per result.

Rating

0.0

(0)

Developer

Muhamed Didovic

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 16 days ago

Scrape the Bassetlaw Community & Voluntary Service jobs board at bcvs.org.uk/latest-vacancies. It is a Drupal 10 site with the REST API disabled, so the actor HTML-scrapes the listing for /job/<slug> URLs and follows each detail page for body text, closing date, contact email/website, and the job-spec PDF attachment URL. JSON or CSV out, no compute charge per run, just per result.

✨ Why use this scraper?

BCVS hosts the voluntary-sector jobs board for Bassetlaw and Bolsover (north Nottinghamshire / Derbyshire). Tracking the local third-sector? Building a CVS network dashboard? Sourcing for paid roles at local charities?

  • 🎯 Two starting points. The /latest-vacancies listing URL (default) or any direct /job/<slug> URL.
  • ⚡ Single HTTP call for the listing. Drupal renders all 3-10 visible jobs in one SSR'd page.
  • 📋 Detail-page enrichment. One fetch per job extracts title (h1), body HTML, closing date (field-closing-date), contact email + website.
  • 📄 PDF attachment URL captured. Real job specs live in PDFs at field-person-specification. We extract the URL + filename so downstream consumers can fetch / index the PDF separately.
  • 🇬🇧 Bassetlaw + Bolsover focus. Member charities: Citizens Advice, parish councils, support services, faith-based orgs.
  • 📤 Clean exports. One row per vacancy. JSON + CSV exported automatically.

🎯 Use cases

| Team | What they build |
| --- | --- |
| Local CVS network | Cross-region nonprofit hiring intelligence in north Notts / north Derbyshire |
| Sector recruiters | Daily new-vacancy feeds for Bassetlaw / Bolsover charities |
| Researchers | Local third-sector labour-market datasets |
| PDF indexing services | Auto-collect job-spec PDFs into searchable archives |
| Workforce strategy | Salary intelligence (post-PDF-OCR) across local charities |

📥 Supported inputs

| URL pattern | Behaviour |
| --- | --- |
| https://www.bcvs.org.uk/latest-vacancies | Full listing (default) |
| https://www.bcvs.org.uk/job/<slug> | Single job: fetches the detail page directly |

Leave startUrls empty for the full listing.

Not supported: REST API access (Drupal /jsonapi is disabled); hosts outside bcvs.org.uk.
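
As a rough illustration of the input rules above, the sketch below sorts a candidate start URL into "listing", "job", or "unsupported". The function name `classify_start_url` and the decision to accept the bare `bcvs.org.uk` host alongside `www.bcvs.org.uk` are assumptions made for this example; the actor's real validation logic is not published here.

```python
from urllib.parse import urlparse

LISTING_PATH = "/latest-vacancies"
JOB_PREFIX = "/job/"
# Assumption: the bare host is treated the same as the www host.
ALLOWED_HOSTS = {"bcvs.org.uk", "www.bcvs.org.uk"}


def classify_start_url(url: str) -> str:
    """Return 'listing', 'job', or 'unsupported' for a candidate start URL."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or parts.hostname not in ALLOWED_HOSTS:
        return "unsupported"
    path = parts.path.rstrip("/")
    if path == LISTING_PATH:
        return "listing"
    # A job URL needs a non-empty slug after the /job/ prefix.
    if path.startswith(JOB_PREFIX) and len(path) > len(JOB_PREFIX):
        return "job"
    return "unsupported"
```

Anything under `/jsonapi`, or on another host, falls through to "unsupported", which mirrors the "Not supported" note above.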

🔄 How it works

  1. Fetch /latest-vacancies, a Drupal HTML page (~33 KB).
  2. Harvest /job/<slug> anchors; typically 3-10 jobs are visible.
  3. For each job (when enrichDetail is true), fetch the detail page.
  4. Parse Drupal field classes:
    • <h1> → title
    • .field--name-body → body HTML + plain text
    • .field--name-field-closing-date → closing date
    • .field--name-field-person-specification a[href$=".pdf"] → job-spec PDF URL
  5. Extract contact email + website from body (first mailto: and first non-bcvs http href).
  6. Push the merged row: original listing fields plus detail-page enrichment.
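
Steps 2 and 5 can be sketched with nothing but the Python standard library. This is an illustrative reimplementation under stated assumptions, not the actor's source: the names `HrefCollector`, `harvest_job_urls`, and `extract_contact` are hypothetical, and the returned keys `email` and `website` are placeholders, not the actor's output field names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

BASE = "https://www.bcvs.org.uk"


class HrefCollector(HTMLParser):
    """Collect every href found on <a> tags, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_hrefs(html: str) -> list[str]:
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


def is_bcvs_host(host: str) -> bool:
    return host == "bcvs.org.uk" or host.endswith(".bcvs.org.uk")


def harvest_job_urls(listing_html: str) -> list[str]:
    """Step 2: absolute, de-duplicated /job/<slug> URLs in page order."""
    seen: dict[str, None] = {}
    for href in collect_hrefs(listing_html):
        url = urljoin(BASE, href)
        parts = urlparse(url)
        path = parts.path
        if is_bcvs_host(parts.hostname or "") and path.startswith("/job/") and len(path) > len("/job/"):
            seen.setdefault(url, None)
    return list(seen)


def extract_contact(body_html: str) -> dict:
    """Step 5: first mailto: address and first non-bcvs http(s) link."""
    email = website = None
    for href in collect_hrefs(body_html):
        lowered = href.lower()
        if lowered.startswith("mailto:"):
            if email is None:
                # Drop any ?subject=... query that follows the address.
                email = href[len("mailto:"):].split("?")[0]
        elif website is None and lowered.startswith(("http://", "https://")):
            if not is_bcvs_host(urlparse(href).hostname or ""):
                website = href
    return {"email": email, "website": website}
```

A real run would feed `harvest_job_urls` the fetched listing HTML and `extract_contact` the inner HTML of `.field--name-body`; selecting that element by class is left out here to keep the sketch dependency-free.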

⚙️ Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | ["https://www.bcvs.org.uk/latest-vacancies"] | Listing URL or single-job URLs. Empty = listing. |
| enrichDetail | boolean | true | When true, fetches each detail page. Disable for listing-only output (title + URL only). |
| maxItems | integer | 1000 | Hard cap on rows pushed (typically 3-10 live). |
| maxConcurrency | integer | 3 | Parallel detail-page fetch limit. |
| maxRequestRetries | integer | 5 | Number of retries before a failed request is abandoned. |
| proxy | object | No proxy | The site has no anti-bot protection, so no proxy is needed. |
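
The parameters above could be assembled and passed to the actor roughly as follows. `build_run_input` is a hypothetical helper that only mirrors the documented defaults. `run_actor` assumes the `apify-client` Python package is installed and that the actor ID is `memo23/bcvs-scraper`, as the store URL suggests. Treat both as a sketch and check them against the actor's input schema.

```python
DEFAULT_INPUT = {
    "startUrls": ["https://www.bcvs.org.uk/latest-vacancies"],
    "enrichDetail": True,
    "maxItems": 1000,
    "maxConcurrency": 3,
    "maxRequestRetries": 5,
}


def build_run_input(**overrides) -> dict:
    """Merge overrides into the documented defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_INPUT) - {"proxy"}
    if unknown:
        raise ValueError(f"Unknown input parameter(s): {sorted(unknown)}")
    run_input = {**DEFAULT_INPUT, **overrides}
    # Per the table, an empty startUrls means "scrape the full listing".
    run_input["startUrls"] = list(run_input["startUrls"]) or list(DEFAULT_INPUT["startUrls"])
    return run_input


def run_actor(token: str, run_input: dict) -> list:
    """Run the actor and return its dataset rows (requires apify-client)."""
    # Imported lazily so build_run_input stays dependency-free.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("memo23/bcvs-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For a listing-only run, `build_run_input(enrichDetail=False, maxItems=100)` keeps the default start URL and skips the per-job detail fetches.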

📊 Output overview

Each scraped vacancy is a single dataset row with type: "job". Listing fields (slug, URL, anchor title) are merged with optional detail-page enrichment (body, closing date, PDF URL, contact info).

📦 Output sample

{
  "type": "job",
  "source": "bcvs.org.uk",
  "jobId": "clowne-parish-council-assistant-clerk",
  "slug": "clowne-parish-council-assistant-clerk",
  "jobUrl": "https://www.bcvs.org.uk/job/clowne-parish-council-assistant-clerk",
  "title": "Clowne Parish Council - Assistant Clerk",
  "description": "<p>Contact: bcvs@bcvs.org.uk…</p>",
  "descriptionText": "Contact: bcvs@bcvs.org.uk…",
  "companyName": null,
  "companyWebsite": null,
  "companyDomain": null,
  "location": "Bassetlaw / Bolsover, England",
  "remote": false,
  "salary": null,
  "salaryRaw": null,
  "categories": [],
  "employmentTypes": [],
  "contractType": null,
  "status": "publish",
  "postedDate": null,
  "closingDate": null,
  "modifiedDate": null,
  "applyType": "email",
  "applyUrl": "https://www.bcvs.org.uk/job/clowne-parish-council-assistant-clerk",
  "applyEmail": "bcvs@bcvs.org.uk",
  "externalApplyUrl": null,
  "jobSpecPdfUrl": "https://www.bcvs.org.uk/sites/default/files/2026-05/job%20advert%20final.pdf",
  "jobSpecPdfFilename": "job advert final.pdf",
  "scrapedAt": "2026-05-20T00:13:00.000Z"
}

🗂 Key output fields

| Group | Fields |
| --- | --- |
| Identifiers | type, source, jobId (= slug), slug, jobUrl, scrapedAt |
| Content | title (from h1), description (HTML, from detail page), descriptionText (plain) |
| Dates | closingDate (from field-closing-date) |
| Employer | companyName (null; H1 typically reads "Employer - Role"), companyWebsite, companyDomain |
| Location | location (always "Bassetlaw / Bolsover, England") |
| Apply flow | applyType (email/external/internal), applyUrl, applyEmail, externalApplyUrl |
| BCVS-specific | jobSpecPdfUrl, jobSpecPdfFilename |
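
To show how a consumer might use these fields, the sketch below derives a readable filename from jobSpecPdfUrl and picks the apply target for a row. Both helpers are hypothetical. The mapping from applyType to a target field is an assumption inferred from the field names, not documented behaviour.

```python
from posixpath import basename
from typing import Optional
from urllib.parse import unquote, urlparse


def pdf_filename(job_spec_pdf_url: Optional[str]) -> Optional[str]:
    """Percent-decode the last path segment of the PDF URL."""
    if not job_spec_pdf_url:
        return None
    name = unquote(basename(urlparse(job_spec_pdf_url).path))
    return name or None


def apply_target(row: dict) -> Optional[str]:
    """Assumed mapping: email -> applyEmail, external -> externalApplyUrl, else applyUrl."""
    apply_type = row.get("applyType")
    if apply_type == "email":
        return row.get("applyEmail")
    if apply_type == "external":
        return row.get("externalApplyUrl")
    return row.get("applyUrl")
```

Applied to the output sample, the PDF URL ending in `job%20advert%20final.pdf` decodes to `job advert final.pdf`, which matches the jobSpecPdfFilename value shown there.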

โ“ FAQ

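Why is companyName always null? BCVS's Drupal field structure doesn't separate employer name from job title: the H1 reads "Employer - Role" (for example, "Clowne Parish Council - Assistant Clerk"), so there is no separate employer field to extract.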
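
If you need an employer name anyway, one option is to split the title yourself. The helper below is a hypothetical post-processing step, not something the actor does. It assumes the "Employer - Role" convention holds and splits at the first " - ", so an employer name that itself contains " - " would be cut short.

```python
from typing import Optional, Tuple


def split_title(title: str) -> Tuple[Optional[str], str]:
    """Split 'Employer - Role' into (employer, role); employer is None if no separator."""
    employer, sep, role = title.partition(" - ")
    if not sep or not employer.strip() or not role.strip():
        return None, title.strip()
    return employer.strip(), role.strip()
```

For the sample row, `split_title("Clowne Parish Council - Assistant Clerk")` gives `("Clowne Parish Council", "Assistant Clerk")`.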

Why is salary always null? Salary information lives in the PDF attachment, which we don't parse. Use jobSpecPdfUrl to fetch + parse the PDF yourself (libraries like pdf-parse / pdfjs-dist work well).
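
A minimal Python sketch of that do-it-yourself step follows. It swaps in the `pypdf` package in place of the JavaScript libraries named above, and it only handles text-based PDFs (scanned ones would need OCR). The function names and the salary pattern are assumptions for illustration. The regex covers simple "£" amounts and ranges, not every way a salary can be written.

```python
import io
import re
import urllib.request
from typing import Optional

# Matches "£25,000", "£12.50", or a range such as "£25,000 - £27,500" / "£25,000 to £27,500".
AMOUNT = r"£\s?\d+(?:,\d{3})*(?:\.\d{1,2})?"
SALARY_RE = re.compile(AMOUNT + r"(?:\s*(?:-|–|to)\s*" + AMOUNT + r")?")


def find_salary(text: str) -> Optional[str]:
    """Return the first salary-like string in the text, or None."""
    match = SALARY_RE.search(text)
    return match.group(0) if match else None


def pdf_text(pdf_url: str) -> str:
    """Download a PDF and return its extracted text (requires pypdf)."""
    # Imported lazily so find_salary works without the dependency.
    from pypdf import PdfReader

    with urllib.request.urlopen(pdf_url, timeout=30) as response:
        data = response.read()
    reader = PdfReader(io.BytesIO(data))
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

Typical use would be `find_salary(pdf_text(row["jobSpecPdfUrl"]))` for rows where jobSpecPdfUrl is not null.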

Why is closingDate sometimes null? Not all charities fill in the field-closing-date Drupal field; it's optional. Closing dates often appear in the PDF instead.

Can I scrape private pages or applicant data? No. Only the public /latest-vacancies listing and public /job/<slug> pages.

How do I limit results? Set maxItems. With only 3-10 live vacancies, maxItems: 100 covers everything safely.

💬 Support

🛠 Additional services

  • Custom output shape, additional fields, or one-off datasets: muhamed.didovic@gmail.com
  • Bundled PDF-spec OCR for a richer dataset (extracts salary, hours, role description from the PDF): drop an email.
  • Similar scrapers for other CVS / volunteer hubs (Doing Good Leeds, VA Rotherham, VAS Sheffield, Barnsley CVS, Community First Yorkshire): drop an email.

🔎 Explore more scrapers

See other scrapers at memo23's Apify profile, covering job boards, real estate, social media, and more.


⚠️ Disclaimer

This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Bassetlaw Community & Voluntary Service (BCVS), bcvs.org.uk, or any of their subsidiaries or affiliates. All trademarks mentioned are the property of their respective owners.

The scraper accesses only the publicly available /latest-vacancies listing page and public /job/<slug> detail pages on bcvs.org.uk: no authenticated endpoints, recruiter-only features, or content behind a login. Users are responsible for ensuring their use complies with bcvs.org.uk's Terms of Service, applicable data-protection law (GDPR, CCPA, etc.), and any contractual obligations of their own organisation.

