
URL: https://apify.com/automation-lab/cms-medicare-provider-scraper

CMS Medicare Provider Scraper & Facility Data Extractor · Apify


CMS Medicare Provider Scraper

Extract CMS Medicare provider and facility datasets with identifiers, addresses, ratings, metadata, raw fields, and source API URLs.

Pricing

Pay per event

Rating: 0.0 (0)

Developer: Stas Persiianenko (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified a day ago

Extract public Medicare provider and facility records from the CMS Provider Data API.

Use this Apify Actor to search CMS Provider Data datasets, fetch rows in bulk, normalize common provider/facility fields, and keep raw CMS columns for deeper analytics.

What does CMS Medicare Provider Scraper do?

CMS Medicare Provider Scraper is an API-first extractor for public datasets hosted on data.cms.gov/provider-data.

It helps you collect Medicare facility and provider records such as dialysis facilities, hospitals, home health agencies, hospices, nursing homes, quality measures, ratings, addresses, and contact fields when those columns exist in the selected CMS dataset.

The actor does not require a CMS login.

It uses the public CMS Provider Data metastore and datastore endpoints.

Who is it for?

๐Ÿฅ Healthcare sales teams use it to build facility and provider lead lists.

๐Ÿ“ Provider-directory teams use it to enrich addresses, phone numbers, identifiers, and CMS dataset context.

๐Ÿ“Š Market analysts use it to compare public Medicare facility coverage and quality attributes by state or dataset.

โœ… Compliance and diligence teams use it to preserve public CMS source fields with source URLs and timestamps.

๐Ÿงฉ Data engineers use it as a repeatable Apify job that exports clean JSON, CSV, Excel, or API results.

Why use this Medicare provider scraper?

CMS exposes many healthcare datasets, but the schemas vary.

This actor normalizes common columns into stable field names.

It also preserves the original CMS row under raw so you do not lose dataset-specific quality, rating, or operational columns.

You can search datasets by keyword or pin exact CMS dataset IDs.

You can limit rows per dataset to keep test runs cheap.
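The normalization idea described above can be illustrated with a minimal Python sketch. The candidate column names in `CANDIDATES` and the helper names `first_present` and `normalize` are hypothetical; the actor's real mapping is not documented here, and CMS column names differ by dataset.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical candidate source columns for each normalized field.
# These names are assumptions for illustration, not the actor's real mapping.
CANDIDATES: Dict[str, Iterable[str]] = {
    "name": ("facility_name", "hospital_name", "provider_name"),
    "state": ("state", "provider_state"),
    "zipCode": ("zip_code", "zip"),
}


def first_present(row: Dict[str, Any], columns: Iterable[str]) -> Optional[Any]:
    """Return the first non-empty value among the candidate columns."""
    for column in columns:
        value = row.get(column)
        if value not in (None, ""):
            return value
    return None


def normalize(row: Dict[str, Any], include_raw: bool = True) -> Dict[str, Any]:
    """Map a source row onto stable field names, optionally keeping the row under raw."""
    record = {field: first_present(row, cols) for field, cols in CANDIDATES.items()}
    if include_raw:
        # Keep a copy of the untouched source row so dataset-specific columns survive.
        record["raw"] = dict(row)
    return record
```

Fields with no matching source column come back as `None`, which mirrors the "when present" wording used for the normalized fields.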

Source data

The source is CMS Provider Data.

Catalog endpoint:

https://data.cms.gov/provider-data/api/1/metastore/schemas/dataset/items

Row endpoint pattern:

https://data.cms.gov/provider-data/api/1/datastore/query/{datasetId}/0?offset={offset}&limit={limit}

CMS controls the dataset contents, update cadence, and column names.
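As a rough illustration of the row endpoint pattern, the sketch below builds page URLs from a dataset ID, offset, and limit. The helper names `row_page_url` and `page_urls` are hypothetical, and the code only constructs URLs; it makes no requests and does not claim to match the actor's internal paging.

```python
from typing import Iterator
from urllib.parse import urlencode

# Base of the row endpoint pattern quoted above.
BASE = "https://data.cms.gov/provider-data/api/1/datastore/query"


def row_page_url(dataset_id: str, offset: int, limit: int) -> str:
    """Fill the {datasetId}, {offset}, and {limit} placeholders of the pattern."""
    query = urlencode({"offset": offset, "limit": limit})
    return f"{BASE}/{dataset_id}/0?{query}"


def page_urls(dataset_id: str, max_rows: int, page_size: int) -> Iterator[str]:
    """Yield one URL per page until max_rows rows have been requested."""
    for offset in range(0, max_rows, page_size):
        # The last page may be shorter than page_size.
        limit = min(page_size, max_rows - offset)
        yield row_page_url(dataset_id, offset, limit)
```

For example, `list(page_urls("23ew-n7w9", 250, 100))` produces three URLs, the last of which requests 50 rows starting at offset 200.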

What data can you extract?

The actor can extract rows from CMS Provider Data datasets that are available through the public datastore API.

Examples include facility listings, hospital quality datasets, dialysis datasets, state/national averages, and other provider-data resources.

You can start with the prefilled dialysis facility listing dataset ID 23ew-n7w9.

You can also search terms like hospital quality, home health, hospice, nursing facility, or provider.

Data table

| Field | Description |
| --- | --- |
| datasetId | CMS Provider Data dataset identifier |
| datasetTitle | CMS dataset title |
| providerId | Best available provider/facility identifier |
| npi | National Provider Identifier when present |
| ccn | CMS Certification Number when present |
| name | Provider, organization, hospital, or facility name |
| providerType | Provider/facility type when present |
| specialty | Specialty/taxonomy when present |
| ownership | Ownership, chain, or profit status when present |
| rating | Common quality/rating field when present |
| addressLine1 | Street address line 1 |
| addressLine2 | Street address line 2 |
| city | City or town |
| state | State abbreviation |
| zipCode | ZIP code |
| county | County/parish |
| phone | Telephone number |
| sourceApiUrl | CMS API URL used for the row page |
| sourceRowIndex | Source row index within the selected CMS dataset |
| scrapedAt | ISO timestamp for extraction |
| dataset | Optional compact CMS dataset metadata |
| raw | Optional original CMS row fields |

How much does it cost to scrape CMS Medicare provider data?

This actor uses pay-per-event pricing.

There is a small run-start event and a per-row item event.

The per-row item price is derived from measured cloud costs. The current BRONZE per-row price is $0.000027457, with lower prices on higher Apify subscription tiers.

For small tests, keep maxRows low.

For production exports, raise maxRows and use exact datasetIds when you already know the CMS dataset you need.
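The pricing model above is simple arithmetic: one run-start charge plus a per-row charge. Here is a small sketch of an estimate, assuming the BRONZE per-row price quoted above. The run-start price is not stated in this text, so it is a required argument here rather than a guessed constant, and `estimate_cost` is a hypothetical helper name.

```python
# Per-row BRONZE price quoted in the pricing text above (USD).
PER_ROW_USD_BRONZE = 0.000027457


def estimate_cost(rows: int, start_event_usd: float,
                  per_row_usd: float = PER_ROW_USD_BRONZE) -> float:
    """Estimate run cost as run-start charge + rows * per-row price.

    start_event_usd is an assumption the caller must supply; check the
    actor's pricing panel for the current value.
    """
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return start_event_usd + rows * per_row_usd
```

At the quoted BRONZE price, the per-row charges for 100,000 rows come to roughly $2.75 before the run-start event is added.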

Input options

datasetIds lets you provide exact CMS dataset identifiers.

datasetSearch finds datasets by terms in CMS metadata.

maxRows limits total rows saved across all selected datasets.

rowsPerDataset limits rows saved from each selected dataset.

pageSize controls CMS API request size.

state filters common state fields.

city filters common city/town fields.

nameContains filters common provider/facility name fields.

includeRawFields preserves all original CMS columns.

includeDatasetMetadata attaches CMS dataset title, description, keywords, and modified date.
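To make the interaction between maxRows and rowsPerDataset concrete, here is a hedged sketch of how the two caps could combine. It assumes datasets are processed in the order given and that the total cap is consumed as rows are saved; the actor's real ordering and accounting are not spelled out above, and `rows_to_take` is a hypothetical helper.

```python
from typing import Dict


def rows_to_take(dataset_sizes: Dict[str, int], max_rows: int,
                 rows_per_dataset: int) -> Dict[str, int]:
    """Plan how many rows to save per dataset under both caps.

    dataset_sizes maps a dataset ID to the number of rows available.
    Assumes in-order processing (an illustration, not documented behavior).
    """
    plan: Dict[str, int] = {}
    remaining = max_rows
    for dataset_id, size in dataset_sizes.items():
        # Each dataset is bounded by its own size, the per-dataset cap,
        # and whatever is left of the total cap.
        take = min(size, rows_per_dataset, remaining)
        plan[dataset_id] = take
        remaining -= take
    return plan
```

Under these assumptions, a low maxRows can leave later datasets with zero rows, which is one reason to pin exact datasetIds for production jobs.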

Example input: exact dataset

{
  "datasetIds": ["23ew-n7w9"],
  "maxRows": 100,
  "rowsPerDataset": 100,
  "pageSize": 100,
  "includeRawFields": true,
  "includeDatasetMetadata": true
}

Example input: search datasets

{
  "datasetSearch": "hospital quality",
  "maxRows": 250,
  "rowsPerDataset": 50,
  "pageSize": 100,
  "state": "CA",
  "includeRawFields": false,
  "includeDatasetMetadata": true
}

Example output

{
  "datasetId": "23ew-n7w9",
  "datasetTitle": "Dialysis Facility - Listing by Facility",
  "providerId": "012306",
  "ccn": "012306",
  "name": "CHILDRENS HOSPITAL DIALYSIS",
  "ownership": "Non-profit",
  "addressLine1": "1600 7TH AVENUE SOUTH",
  "city": "BIRMINGHAM",
  "state": "AL",
  "zipCode": "35233",
  "county": "Jefferson",
  "phone": "(205) 638-9275",
  "sourceApiUrl": "https://data.cms.gov/provider-data/api/1/datastore/query/23ew-n7w9/0?offset=0&limit=100",
  "sourceRowIndex": 0,
  "scrapedAt": "2026-06-28T00:00:00.000Z"
}

How to use this actor

  1. Open the actor on Apify.

  2. Choose exact datasetIds or enter datasetSearch terms.

  3. Set maxRows and rowsPerDataset.

  4. Add optional filters such as state or nameContains.

  5. Run the actor.

  6. Export the dataset as JSON, CSV, Excel, XML, RSS, or HTML.
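Steps 2 to 4 above amount to assembling an input object. The sketch below shows one way to build it in Python before passing it to the actor. The `build_input` helper and its validation rules are assumptions for illustration: the actor itself may accept other combinations.

```python
from typing import Any, Dict, List, Optional

# Filter names taken from the input options listed earlier.
ALLOWED_FILTERS = ("state", "city", "nameContains")


def build_input(dataset_ids: Optional[List[str]] = None,
                dataset_search: Optional[str] = None,
                max_rows: int = 100,
                rows_per_dataset: int = 100,
                **filters: str) -> Dict[str, Any]:
    """Assemble an actor input dict from exact IDs or search terms plus filters."""
    # Assumed rule: at least one way of selecting datasets must be given.
    if not dataset_ids and not dataset_search:
        raise ValueError("provide datasetIds or datasetSearch")
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    run_input: Dict[str, Any] = {"maxRows": max_rows, "rowsPerDataset": rows_per_dataset}
    if dataset_ids:
        run_input["datasetIds"] = list(dataset_ids)
    if dataset_search:
        run_input["datasetSearch"] = dataset_search
    run_input.update(filters)
    return run_input
```

Calling `build_input(dataset_search="hospital quality", max_rows=250, rows_per_dataset=50, state="CA")` yields a dict that mirrors the search example shown earlier, minus the optional flags.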

Tips for best results

Use exact CMS dataset IDs for repeatable production jobs.

Use search terms when discovering useful datasets.

Keep includeRawFields enabled if you need quality measures, rating details, dates, or dataset-specific columns.

Disable includeRawFields if you only need the normalized contact and identifier fields.

Start with a small maxRows to validate the dataset and filters.

Integrations

Send extracted rows to a CRM for healthcare lead enrichment.

Load rows into a data warehouse for provider network analysis.

Schedule recurring runs to monitor public CMS dataset updates.

Use webhooks to trigger downstream validation workflows after each run.

Combine results with your NPI, claims, licensing, or facility-reference data.

API usage with Node.js

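// Requires the apify-client package and an APIFY_TOKEN environment variable.
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/cms-medicare-provider-scraper').call({
    datasetIds: ['23ew-n7w9'],
    maxRows: 100,
    includeRawFields: true,
});

// The run's default dataset holds the extracted rows.
console.log(run.defaultDatasetId);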

API usage with Python

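import os

from apify_client import ApifyClient

# Reads the API token from the APIFY_TOKEN environment variable.
client = ApifyClient(os.environ['APIFY_TOKEN'])

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/cms-medicare-provider-scraper').call(run_input={
    'datasetIds': ['23ew-n7w9'],
    'maxRows': 100,
    'includeRawFields': True,
})

# The run's default dataset holds the extracted rows.
print(run['defaultDatasetId'])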
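Once a run finishes, the rows live in the run's default dataset. A possible follow-up, sketched below, reads the items and writes a few normalized fields to CSV text. The `FIELDS` selection and both helper names are illustrative choices; `fetch_items` relies on the `apify-client` package's dataset `iterate_items()` method and is not called here, so the CSV helper can be tried without network access.

```python
import csv
import io
from typing import Any, Dict, Iterable, List, Sequence

# An illustrative subset of the normalized fields from the data table.
FIELDS = ["datasetId", "providerId", "name", "city", "state", "zipCode", "phone"]


def items_to_csv(items: Iterable[Dict[str, Any]],
                 fields: Sequence[str] = FIELDS) -> str:
    """Render selected fields of each item as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields))
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells; extra keys such as raw are skipped.
        writer.writerow({field: item.get(field, "") for field in fields})
    return buffer.getvalue()


def fetch_items(token: str, dataset_id: str) -> List[Dict[str, Any]]:
    """Load all items of an Apify dataset (requires the apify-client package)."""
    from apify_client import ApifyClient  # imported lazily so the CSV helper works alone

    client = ApifyClient(token)
    return list(client.dataset(dataset_id).iterate_items())
```

A typical call would be `items_to_csv(fetch_items(os.environ['APIFY_TOKEN'], run['defaultDatasetId']))`, using the `run` object from the example above.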

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~cms-medicare-provider-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"datasetIds":["23ew-n7w9"],"maxRows":100,"includeRawFields":true}'

MCP usage

Use the Apify MCP server with this actor when you want Claude or another MCP client to run CMS provider-data extraction.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/cms-medicare-provider-scraper

Claude Code setup:

$ claude mcp add apify-cms-medicare-provider-scraper https://mcp.apify.com/?tools=automation-lab/cms-medicare-provider-scraper

JSON MCP configuration:

{
  "mcpServers": {
    "apify-cms-medicare-provider-scraper": {
      "url": "https://mcp.apify.com/?tools=automation-lab/cms-medicare-provider-scraper"
    }
  }
}

Example prompts:

  • "Run the CMS Medicare Provider Scraper for dialysis facility dataset 23ew-n7w9 and return 100 Alabama rows."
  • "Find CMS hospital quality datasets and export the first 250 rows with raw fields disabled."
  • "Extract Medicare provider/facility records for CA and summarize the available dataset titles."

Related scrapers

Use https://apify.com/automation-lab/npi-registry-provider-scraper when you specifically need NPI Registry records.

Use https://apify.com/automation-lab/fda-openfda-scraper when you need FDA/OpenFDA public datasets.

Use this actor when the source of truth should be CMS Provider Data datasets.

Legality and responsible use

CMS Provider Data is public government data.

You are responsible for using the extracted data in compliance with applicable laws, contracts, and internal policies.

Do not treat public records as a substitute for legal, compliance, or clinical advice.

Respect CMS source attribution and dataset documentation.

Troubleshooting

If no rows are returned, check whether your datasetIds are valid CMS Provider Data identifiers.

If search returns an unexpected dataset, use exact datasetIds instead of broad search terms.

If normalized fields are empty, inspect the raw object because CMS source schemas differ by dataset.

If filters remove too many rows, test without state, city, or nameContains first.
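When normalized fields come back empty, a quick look at the raw object usually shows which source columns the dataset actually has. The helpers below are a small sketch for that inspection; `empty_fields` and `raw_columns` are hypothetical names, and they assume the run used includeRawFields so that each item carries a `raw` dict.

```python
from typing import Any, Dict, List, Sequence


def empty_fields(item: Dict[str, Any], fields: Sequence[str]) -> List[str]:
    """List the normalized fields that are missing or empty on an item."""
    return [field for field in fields if item.get(field) in (None, "")]


def raw_columns(item: Dict[str, Any]) -> List[str]:
    """Return the sorted source column names preserved under raw, if any."""
    raw = item.get("raw") or {}
    return sorted(raw.keys())
```

Comparing the two results for a sample item shows which raw columns could stand in for the empty normalized fields.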

FAQ

Does this require a CMS account?

No. It uses public CMS Provider Data API endpoints.

Can I scrape all CMS Provider Data datasets?

Yes, but start with specific datasets and bounded maxRows values. CMS schemas vary, and very large exports should be split into predictable jobs.
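One way to split a large export into predictable jobs is to give each run a fixed group of dataset IDs and its own maxRows bound. The sketch below only builds the input objects; `split_jobs` is a hypothetical helper, and grouping by dataset ID is an assumption about what makes a sensible job boundary.

```python
from typing import Any, Dict, List


def split_jobs(dataset_ids: List[str], max_rows_per_job: int,
               ids_per_job: int = 1) -> List[Dict[str, Any]]:
    """Build one actor input per group of dataset IDs, each with a bounded maxRows."""
    if ids_per_job < 1:
        raise ValueError("ids_per_job must be at least 1")
    return [
        {"datasetIds": dataset_ids[start:start + ids_per_job],
         "maxRows": max_rows_per_job}
        for start in range(0, len(dataset_ids), ids_per_job)
    ]
```

Each returned dict can be passed as the run input of a separate actor run, so a failed job can be retried without repeating the others.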

Why are some fields missing?

Different CMS datasets expose different columns. The actor fills normalized fields when matching source columns exist and can preserve all original fields under raw.

Can I filter by specialty?

The current actor normalizes specialty when present, but specialty filtering depends on the selected CMS dataset's columns. Use raw output to inspect exact field names.

Is this the same as an NPI Registry scraper?

No. It targets CMS Provider Data datasets. For NPI Registry-specific search, use an NPI-focused actor.

Changelog

Initial version extracts CMS Provider Data catalog-backed rows, normalizes common provider/facility fields, and preserves raw CMS columns.

You might also like

CMS Medicare PECOS Provider Enrollment Scraper

jungle_synthesizer/cms-medicare-pecos-enrollment-crawler

CMS Medicare provider enrollment: PECOS active enrollments, Opt-Out Affidavits, and Order/Refer authority. Filter by NPI, state, or specialty. Merged-by-NPI mode joins all three datasets. For pharma, medical-device, billing/RCM, and payer credentialing. Pairs with NPPES NPI crawler.

๐Ÿ‘ User avatar

BowTiedRaccoon

2

CMS Medicare Provider Utilization & Payment Crawler

jungle_synthesizer/cms-provider-utilization-crawler

Crawl Medicare provider data from the CMS Provider Data API. Access 2.8M+ clinician records with credentials, specialties, and addresses. Get utilization data, MIPS quality scores, and office visit costs. Filter by state, specialty, and provider name.

๐Ÿ‘ User avatar

BowTiedRaccoon

2

CMS Medicare Spending Scraper | MSPB Provider Data

parseforge/cms-data-medicare-spending-scraper

Export CMS Medicare Spending Per Beneficiary (MSPB) data at hospital, state and national level. Filter by US state. Pull facility ID, name, address, MSPB score and measurement period from the official data.cms.gov API. CSV, Excel, JSON or XML for healthcare research.

CMS Hospital General Information Scraper

compute-edge/cms-hospital-general-scraper

Extract CMS Hospital General Info for 5,400+ U.S. hospitals. Filter by state, hospital type, ownership, overall star rating, and emergency services. Includes addresses, phone, and mortality/safety/readmission measure counts.