URL: https://apify.com/automation-lab/pubmed-search-scraper

PubMed Search Scraper: Articles, Abstracts & DOI · Apify


Pricing

from $0.01 / 1,000 PubMed articles extracted

PubMed Search Scraper

Search PubMed via the official NCBI API and extract article metadata, abstracts, DOI, authors, journals, MeSH terms, and keywords.

Rating: 0.0 (0)

Developer: Stas Persiianenko

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 2

Monthly active users: 1

Last modified: 2 days ago

Extract PubMed article metadata, abstracts, DOI, authors, journals, MeSH terms, keywords, and publication dates from the official NCBI E-utilities API.

Use this actor when you need a repeatable PubMed literature-monitoring pipeline for biomedical research, pharma intelligence, systematic reviews, clinical-trial landscaping, academic discovery, or RAG dataset preparation.

What does PubMed Search Scraper do?

PubMed Search Scraper turns one or more PubMed search queries into a clean Apify dataset.

It uses NCBI ESearch, ESummary, and EFetch XML endpoints.

It does not scrape PubMed HTML pages.

It does not require login cookies.

It does not require browser automation.

It can run with no NCBI API key for normal public access.

Add an optional NCBI API key only when you need higher throughput.
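
As a rough illustration of the three E-utilities calls named above, the sketch below builds the request URLs with the standard library only. The base URL and the `db`, `term`, `retmax`, `retmode`, `id`, and `api_key` parameters belong to the public E-utilities interface; the helper name `eutils_url` and the parameter values chosen here are assumptions for illustration and are not taken from the actor's source.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def eutils_url(endpoint, api_key=None, **params):
    """Build a URL for an E-utilities endpoint such as 'esearch' or 'efetch'."""
    query = {"db": "pubmed", **params}
    if api_key:  # optional; only needed for higher throughput
        query["api_key"] = api_key
    return f"{EUTILS_BASE}/{endpoint}.fcgi?{urlencode(query)}"

# 1) ESearch: query text -> list of PMIDs
search_url = eutils_url("esearch", term="cancer immunotherapy", retmax=25, retmode="json")
# 2) ESummary: PMIDs -> summary metadata
summary_url = eutils_url("esummary", id="42345602", retmode="json")
# 3) EFetch: PMIDs -> citation XML (abstract, MeSH terms, keywords)
fetch_url = eutils_url("efetch", id="42345602", retmode="xml")

print(search_url)
```

No request is sent here; the URLs can be pasted into a browser to see what each endpoint returns.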

Who is it for?

  • 🧬 Medical researchers tracking new papers for a topic.
  • πŸ’Š Pharma and biotech analysts monitoring drug, disease, biomarker, and target literature.
  • πŸ₯ Clinical evidence teams building review queues.
  • πŸŽ“ Academic labs collecting citation metadata for literature reviews.
  • πŸ€– AI and RAG teams preparing biomedical document indexes.
  • πŸ“ˆ Competitive-intelligence teams watching publications by disease area, journal, or author keyword.
  • 🧾 Systematic-review teams exporting article metadata before screening.

Why use this actor?

PubMed search results are easy to inspect manually but hard to operationalize at scale.

This actor gives you structured rows with stable identifiers and metadata that are ready for export.

You can schedule it daily or weekly to monitor new papers.

You can send the dataset to Google Sheets, S3, Make, Zapier, or your own database.

You can use PubMed query syntax directly, including field tags such as [Title] or [MeSH Terms].

Data you can extract

Field                  Description
pmid                   PubMed identifier
title                  Article title
abstract               Abstract text when available
journal                Journal name
journalIssn            ISSN from PubMed XML when available
publicationDate        Publication date
epubDate               Electronic publication date from ESummary
authors                Structured author objects with affiliations when available
authorNames            Flat author-name list
doi                    Digital Object Identifier
articleTypes           Publication types such as Review or Clinical Trial
meshTerms              MeSH descriptor terms
keywords               Author keywords
language               PubMed language code
url                    PubMed article URL
query                  Input query that produced the article
rank                   Result rank within the query
totalResultsForQuery   Total PubMed matches reported by ESearch

How much does it cost to scrape PubMed search results?

This actor uses pay-per-event pricing.

You pay a small start fee plus a per-result fee for each PubMed article saved to the dataset.

The default input is intentionally small so your first run is cheap.

For large literature reviews, increase maxResultsPerQuery once you have confirmed that the query returns the records you expect.

The actor uses the public NCBI API and no proxies, so platform costs are kept low.
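
The pricing model above (a start fee plus a per-result fee) can be estimated with simple arithmetic. The sketch below uses the listed "from $0.01 / 1,000" price as the per-result rate; the listing does not state the start fee, so `start_fee` is a placeholder you would need to fill in from the actor's pricing tab, and `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(num_results, price_per_1000=0.01, start_fee=0.0):
    """Estimate run cost in USD: start fee plus per-result charges.

    price_per_1000 is the listed "from" price; start_fee is an
    assumption because the listing does not give the amount.
    """
    return start_fee + (num_results / 1000) * price_per_1000

# Two queries with maxResultsPerQuery = 100 save at most 200 articles.
cost = estimate_cost(2 * 100)
print(f"${cost:.4f}")  # $0.0020 at the listed price with no start fee
```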

How to use PubMed Search Scraper

  1. Open automation-lab/pubmed-search-scraper on Apify.
  2. Enter one or more PubMed queries.
  3. Choose how many articles to save per query.
  4. Optionally set a date range.
  5. Optionally restrict by article type or journal.
  6. Decide whether to include abstracts, MeSH terms, and keywords.
  7. Run the actor.
  8. Export the dataset as JSON, CSV, Excel, XML, RSS, or through the Apify API.

Input example

{
    "queries": [
        "cancer immunotherapy",
        "machine learning radiology"
    ],
    "maxResultsPerQuery": 100,
    "sort": "pub_date",
    "minDate": "2024/01/01",
    "articleTypes": ["Review"],
    "includeAbstract": true,
    "includeMeshTerms": true,
    "requestsPerSecond": 3
}

Output example

{
    "pmid": "42345602",
    "title": "Early neutrophil infiltration promotes TRIMELVax-induced antitumor immunity...",
    "abstract": "Enhancing innate-adaptive immune crosstalk is key...",
    "journal": "Oncoimmunology",
    "publicationDate": "2026-Dec-31",
    "authors": [{"name": "Amarilis Pérez-Baños"}],
    "doi": "10.1080/2162402X.2026.2680766",
    "articleTypes": ["Journal Article"],
    "meshTerms": ["Animals", "Neutrophils"],
    "keywords": ["Immunotherapy", "cancer vaccine"],
    "url": "https://pubmed.ncbi.nlm.nih.gov/42345602/",
    "query": "cancer immunotherapy",
    "rank": 1,
    "source": "PubMed"
}

PubMed query tips

Use normal PubMed query syntax.

Examples:

  • cancer immunotherapy
  • CRISPR[Title]
  • "machine learning"[MeSH Terms]
  • diabetes AND metformin
  • Nature Medicine[Journal] AND oncology
  • COVID-19 vaccine AND randomized controlled trial
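
When queries are generated programmatically, a small helper keeps the quoting and field tags consistent. The sketch below is one possible approach; `tagged` and `all_of` are hypothetical names, and the output is ordinary PubMed query text that can be passed in the queries input. Quoting a multi-word term asks PubMed for a phrase match, so leave the quotes out if you want PubMed's automatic term mapping.

```python
def tagged(term, field=None):
    """Quote multi-word terms and append an optional PubMed field tag."""
    text = f'"{term}"' if " " in term else term
    return f"{text}[{field}]" if field else text

def all_of(*parts):
    """Combine query parts so that every part must match."""
    return " AND ".join(parts)

query = all_of(tagged("machine learning", "MeSH Terms"), tagged("radiology", "Title"))
print(query)  # "machine learning"[MeSH Terms] AND radiology[Title]
```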

Keep your first run small.

Check the output.

Then increase maxResultsPerQuery for production use.

Date and article-type filtering

Use minDate and maxDate to monitor new papers.

Use dateType to decide which PubMed date field is filtered.

Use articleTypes for publication types such as:

  • Review
  • Clinical Trial
  • Randomized Controlled Trial
  • Meta-Analysis
  • Systematic Review
  • Case Reports

Use journals to restrict results to specific journals.
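
For scheduled monitoring, the date window usually has to move with each run. The sketch below builds an input object for a rolling window. It assumes maxDate accepts the same YYYY/MM/DD format that the input example shows for minDate; `monitoring_input` is a hypothetical helper, and dateType is left at the actor's default.

```python
from datetime import date, timedelta

def monitoring_input(queries, days_back, today=None, article_types=None):
    """Build actor input covering the last `days_back` days up to `today`."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    run_input = {
        "queries": list(queries),
        "minDate": start.strftime("%Y/%m/%d"),
        "maxDate": today.strftime("%Y/%m/%d"),  # format assumed to match minDate
    }
    if article_types:
        run_input["articleTypes"] = list(article_types)
    return run_input

weekly = monitoring_input(["GLP-1 cardiovascular outcomes"], days_back=7,
                          article_types=["Randomized Controlled Trial"])
print(weekly)
```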

Integrations

This actor works well in automated research workflows.

  • 🔁 Schedule daily searches for new biomedical papers.
  • 🧾 Export CSV for review-screening tools.
  • 📊 Send article metadata to Google Sheets.
  • 🧠 Feed abstracts and MeSH terms into RAG pipelines.
  • 🗄️ Store PMIDs and DOI values in a data warehouse.
  • 🔔 Trigger alerts when new papers match high-value disease or drug queries.
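
Apify can export datasets as CSV directly, but screening tools often expect a specific column set with list fields flattened into one cell. The sketch below shows one way to do that with the standard library on items already downloaded from a run; the column selection and the `items_to_csv` name are illustrative assumptions.

```python
import csv
import io

COLUMNS = ["pmid", "title", "journal", "publicationDate", "doi", "meshTerms", "keywords"]

def items_to_csv(items, columns=COLUMNS):
    """Serialize dataset items to CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for item in items:
        row = {}
        for col in columns:
            value = item.get(col, "")  # missing fields become empty cells
            row[col] = "; ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()

sample = [{"pmid": "42345602", "journal": "Oncoimmunology",
           "meshTerms": ["Animals", "Neutrophils"]}]
print(items_to_csv(sample))
```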

API usage with Node.js

import { ApifyClient } from 'apify-client';

// Read the Apify API token from the environment.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/pubmed-search-scraper').call({
    queries: ['cancer immunotherapy'],
    maxResultsPerQuery: 50,
    sort: 'pub_date',
    includeAbstract: true,
});

// Fetch the saved articles from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

API usage with Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/pubmed-search-scraper').call(run_input={
    'queries': ['machine learning radiology'],
    'maxResultsPerQuery': 50,
    'sort': 'pub_date',
    'includeAbstract': True,
})

# Fetch the saved articles from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~pubmed-search-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"queries":["CRISPR[Title]"],"maxResultsPerQuery":25,"includeAbstract":true}'

MCP: use PubMed Search Scraper from Claude

You can call this actor through Apify MCP from Claude Code or Claude Desktop.

MCP server URL:

https://mcp.apify.com/?tools=automation-lab/pubmed-search-scraper

Claude Code setup:

$ claude mcp add apify-pubmed-search 'https://mcp.apify.com/?tools=automation-lab/pubmed-search-scraper'

Claude Desktop JSON config:

{
    "mcpServers": {
        "apify-pubmed-search": {
            "url": "https://mcp.apify.com/?tools=automation-lab/pubmed-search-scraper"
        }
    }
}

Example prompts:

  • "Search PubMed for 50 recent review articles about CAR-T adverse events and summarize the journal distribution."
  • "Run the PubMed scraper for machine learning radiology since 2024 and return DOI, title, journal, and abstracts."
  • "Find recent PubMed papers about GLP-1 cardiovascular outcomes and prepare a screening table."

NCBI API key and rate limits

NCBI E-utilities works without an API key for normal public use.

Without an API key, the actor caps requests at a conservative 3 requests per second.

With an API key, you can set a higher requestsPerSecond value up to 10.

The actor batches ESummary and EFetch requests, so many PMIDs are fetched per call and the total request count stays low.
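
The interaction between the rate cap and batching can be sketched as follows. The limits of 3 and 10 requests per second come from the text above; the batch size of 200 and the helper names `effective_rps` and `batches` are assumptions for illustration, since the actor's real batch size is not documented here.

```python
import math

def effective_rps(requested, has_api_key):
    """Clamp a requested rate to 3 req/s without an API key, 10 with one."""
    ceiling = 10 if has_api_key else 3
    return max(1, min(requested, ceiling))

def batches(pmids, size):
    """Yield consecutive slices of at most `size` PMIDs."""
    for start in range(0, len(pmids), size):
        yield pmids[start:start + size]

pmids = [str(n) for n in range(500)]
batch_size = 200  # hypothetical value, not taken from the actor
rps = effective_rps(8, has_api_key=False)
request_count = math.ceil(len(pmids) / batch_size)
min_delay = 1 / rps  # seconds to wait between consecutive requests

print(rps, request_count, round(min_delay, 3))
```

With these assumed numbers, 500 articles need 3 detail requests instead of 500, which is why batching matters more than the per-second cap for most runs.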

FAQ

Does this PubMed scraper require an API key?

No. It works with the public NCBI E-utilities API. Add an optional API key only if you need higher request throughput.

Does it download full-text articles?

No. It extracts PubMed citation metadata and abstracts available through PubMed XML. It does not bypass publisher paywalls.

Troubleshooting

Why did I get zero results?

Your query may be too narrow, the date range may exclude all records, or a publication type/journal filter may not match PubMed indexing.

Try the query in PubMed directly, remove filters, and rerun with a small limit.

Why is an abstract missing?

Not every PubMed record has an abstract in the XML response.

If PubMed does not provide an abstract, the abstract field is omitted or empty.

Why are some MeSH terms missing?

Fresh records may not have MeSH indexing yet.

PubMed indexing can lag behind publication.
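
Because the abstract field may be omitted or empty and MeSH terms may arrive later, downstream code should treat both as optional. The sketch below is one defensive way to handle that; the helper names are hypothetical and the field names follow the output example.

```python
def abstract_text(item):
    """Return the abstract, treating an omitted, None, or blank field alike."""
    return (item.get("abstract") or "").strip()

def has_mesh(item):
    """True when the record already carries at least one MeSH term."""
    return bool(item.get("meshTerms"))

def split_for_indexing(items):
    """Separate records with an abstract from those to re-check on a later run."""
    ready, recheck = [], []
    for item in items:
        (ready if abstract_text(item) else recheck).append(item)
    return ready, recheck

items = [
    {"pmid": "1", "abstract": "Some text.", "meshTerms": ["Animals"]},
    {"pmid": "2", "abstract": ""},
    {"pmid": "3"},
]
ready, recheck = split_for_indexing(items)
print([i["pmid"] for i in ready], [i["pmid"] for i in recheck])
```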

Legality and responsible use

This actor uses NCBI's public E-utilities API.

It does not bypass login, paywalls, or private systems.

Respect NCBI usage guidelines and keep request rates reasonable.

If you run large scheduled workflows, provide an NCBI API key and contact email.

Best practices

Start with one query.

Use maxResultsPerQuery around 25 for validation.

Export the dataset and inspect fields.

Then increase volume or add more queries.

Use PubMed field tags when you need precision.

Use scheduled runs for monitoring new publications.

Store PMIDs so downstream systems can deduplicate records.
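
Deduplication by PMID can be as simple as keeping a set of identifiers already seen. The sketch below assumes the set is loaded from and saved to your own storage between scheduled runs; `new_records` is a hypothetical helper.

```python
def new_records(items, seen_pmids):
    """Return items whose PMID is not in seen_pmids, and record them as seen."""
    fresh = []
    for item in items:
        pmid = item.get("pmid")
        if pmid and pmid not in seen_pmids:
            seen_pmids.add(pmid)
            fresh.append(item)
    return fresh

seen = {"42345602"}  # PMIDs stored from earlier runs
run_items = [{"pmid": "42345602"}, {"pmid": "42345700"}, {"pmid": "42345700"}]
print(new_records(run_items, seen))  # [{'pmid': '42345700'}]
```

The same article can match several queries in one run, so the helper also drops repeats within a single batch.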

Support

If a run fails, include the run ID, input JSON, and a short description of what you expected.

For query-quality questions, include the exact PubMed query and the date filters you used.
