URL: https://apify.com/crawlerbros/pubchem-chemical-compound-scraper

PubChem Chemical Compound Scraper · Apify


Pricing

from $3.00 / 1,000 results

PubChem Chemical Compound Scraper

Search PubChem - the world's largest free chemistry database with 100M+ compounds. Search by name, get by CID, or fetch synonyms. Returns molecular formula, weight, SMILES, InChI, logP, H-bond counts, and more. No API key required.

Rating: 0.0 (0)

Developer: Crawler Bros (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 19 days ago

Scrape PubChem, the world's largest free chemistry database, with 100M+ compounds maintained by the NCBI. Search by compound name or PubChem CID, fetch detailed molecular properties, or retrieve all known synonyms. HTTP-only via the public PubChem REST API. No API key, no proxy required.

What this actor does

  • Four modes: searchCompounds, getByName, getByCID, getSynonyms
  • Name-based search: returns multiple matching compounds for a query term
  • Exact lookup: get full detail for a specific compound name or CID
  • Synonyms: retrieve all known names and identifiers for any compound
  • Rich properties: molecular formula, weight, SMILES, InChI, InChIKey, XLogP, H-bond counts, heavy atom count, complexity, charge
  • Empty fields are omitted: no nulls in output

Modes

| Mode | Description |
| --- | --- |
| searchCompounds | Search by keyword; returns multiple matching compounds |
| getByName | Get detailed info for an exact compound name |
| getByCID | Get compound by PubChem CID number |
| getSynonyms | Get all known synonyms for a compound |
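
As an illustration of how the four modes could map onto the public PubChem PUG REST API, here is a minimal sketch. It is not the actor's source code: the function name `build_url`, the property selection, and the choice of `name_type=word` for keyword search are all assumptions. The SMILES property is left out because its name in PUG REST has varied; check the PUG REST property table before adding it.

```python
from urllib.parse import quote

PUG_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# PUG REST property names; this particular selection is illustrative only.
PROPERTIES = [
    "IUPACName", "MolecularFormula", "MolecularWeight", "InChIKey", "XLogP",
    "HBondDonorCount", "HBondAcceptorCount", "RotatableBondCount",
    "HeavyAtomCount", "Complexity", "Charge",
]


def build_url(mode, compound_name=None, cid=None):
    """Return a PUG REST URL for one of the four modes (hypothetical helper)."""
    props = ",".join(PROPERTIES)
    # Percent-encode the name so spaces and slashes are safe in a URL path.
    name = quote(compound_name, safe="") if compound_name else None
    if mode == "searchCompounds":
        # name_type=word requests word-level matches instead of the exact name.
        return f"{PUG_BASE}/compound/name/{name}/cids/JSON?name_type=word"
    if mode == "getByName":
        return f"{PUG_BASE}/compound/name/{name}/property/{props}/JSON"
    if mode == "getByCID":
        return f"{PUG_BASE}/compound/cid/{int(cid)}/property/{props}/JSON"
    if mode == "getSynonyms":
        # Assumption: prefer the CID when given, otherwise fall back to the name.
        if cid is not None:
            return f"{PUG_BASE}/compound/cid/{int(cid)}/synonyms/JSON"
        return f"{PUG_BASE}/compound/name/{name}/synonyms/JSON"
    raise ValueError(f"unknown mode: {mode}")
```

For example, `build_url("getSynonyms", cid=2244)` yields the synonyms endpoint for aspirin, and a name such as `acetylsalicylic acid` is encoded as `acetylsalicylic%20acid` in the path.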

Input

| Field | Type | Description |
| --- | --- | --- |
| mode | select | Which mode to use (default: searchCompounds) |
| compoundName | string | Compound name to search or look up (e.g. aspirin, caffeine) |
| cid | integer | PubChem CID for getByCID or getSynonyms mode |
| maxItems | integer | Maximum records to return, 1–200 (default: 20) |
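
The input rules above can be sketched as a small normalization step. This is a hypothetical helper, not the actor's own validation: whether the actor clamps an out-of-range `maxItems` or rejects it is not documented here, and clamping is an assumption, as is the rule for which field each mode requires.

```python
VALID_MODES = {"searchCompounds", "getByName", "getByCID", "getSynonyms"}


def normalize_input(raw):
    """Apply the documented defaults and limits to a raw input dict."""
    mode = raw.get("mode", "searchCompounds")  # documented default
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")

    # Documented range is 1-200 with a default of 20; clamping is an assumption.
    max_items = max(1, min(200, int(raw.get("maxItems", 20))))

    name = (raw.get("compoundName") or "").strip() or None
    cid = raw.get("cid")

    # Assumed requirements per mode, inferred from the field descriptions.
    if mode in ("searchCompounds", "getByName") and name is None:
        raise ValueError(f"{mode} requires compoundName")
    if mode == "getByCID" and cid is None:
        raise ValueError("getByCID requires cid")
    if mode == "getSynonyms" and cid is None and name is None:
        raise ValueError("getSynonyms requires cid or compoundName")

    result = {"mode": mode, "maxItems": max_items}
    if name is not None:
        result["compoundName"] = name
    if cid is not None:
        result["cid"] = int(cid)
    return result
```

A minimal input such as `{"compoundName": "aspirin"}` therefore resolves to searchCompounds mode with `maxItems` set to 20.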

Output per compound

| Field | Type | Description |
| --- | --- | --- |
| cid | integer | PubChem Compound ID |
| iupacName | string | IUPAC systematic name |
| commonName | string | Common/trade name (first synonym or provided name) |
| molecularFormula | string | Molecular formula (e.g. C9H8O4) |
| molecularWeight | float | Molecular weight in g/mol |
| canonicalSmiles | string | Canonical SMILES notation |
| inchiKey | string | Standard InChIKey hash |
| xLogP | float | Computed XLogP3 lipophilicity |
| hBondDonorCount | integer | Number of hydrogen bond donors |
| hBondAcceptorCount | integer | Number of hydrogen bond acceptors |
| rotatableBondCount | integer | Number of rotatable bonds |
| heavyAtomCount | integer | Number of heavy (non-hydrogen) atoms |
| complexity | float | Molecular complexity score |
| charge | integer | Formal charge of the compound |
| synonyms | array | Top 5 known synonyms |
| pubchemUrl | string | Direct link to PubChem compound page |
| scrapedAt | string | ISO 8601 timestamp of when the record was scraped |
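
To make the "empty fields are omitted" rule concrete, the sketch below maps a PubChem property record onto the output fields listed above. It is an assumption-laden illustration rather than the actor's code: `to_record`, `FIELD_MAP`, and the subset of fields shown are hypothetical, and the source keys are PUG REST property names.

```python
from datetime import datetime, timezone

# PUG REST property name -> output field name (subset, for illustration).
FIELD_MAP = {
    "CID": "cid",
    "IUPACName": "iupacName",
    "MolecularFormula": "molecularFormula",
    "MolecularWeight": "molecularWeight",
    "InChIKey": "inchiKey",
    "XLogP": "xLogP",
    "Charge": "charge",
}
FLOAT_FIELDS = {"molecularWeight", "xLogP"}


def to_record(props, synonyms=(), provided_name=None):
    """Build one output record, skipping fields that have no data."""
    record = {}
    for source_key, field in FIELD_MAP.items():
        value = props.get(source_key)
        # Skip only missing or empty values; a formal charge of 0 is real data.
        if value is None or value == "":
            continue
        record[field] = float(value) if field in FLOAT_FIELDS else value

    names = [s for s in synonyms if s]
    common = names[0] if names else provided_name
    if common:
        record["commonName"] = common  # first synonym, else the provided name
    if names:
        record["synonyms"] = names[:5]  # top 5 synonyms
    if "cid" in record:
        record["pubchemUrl"] = f"https://pubchem.ncbi.nlm.nih.gov/compound/{record['cid']}"
    # Timezone-aware ISO 8601 timestamp in UTC.
    record["scrapedAt"] = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return record
```

The explicit `None`/empty-string check matters: a plain truthiness test would silently drop a `charge` of 0, which is a valid value for most compounds.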

Data source

PubChem is a free chemistry database maintained by the National Center for Biotechnology Information (NCBI), part of the US National Institutes of Health. The PubChem REST API is completely free, requires no registration, and is rate limited to 5 requests/second.
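
One simple way a client could stay under the 5 requests/second limit is to space requests at least 0.2 seconds apart. The sketch below shows that idea; how this actor actually enforces the limit is not documented here, and the `Throttle` class is hypothetical. The clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Space calls so that at most `max_per_second` happen per second."""

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Calling `throttle.wait()` immediately before each HTTP request keeps consecutive requests at least `min_interval` seconds apart; the first call never sleeps.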

Example output

{
  "cid": 2244,
  "iupacName": "2-(acetyloxy)benzoic acid",
  "commonName": "aspirin",
  "molecularFormula": "C9H8O4",
  "molecularWeight": 180.16,
  "canonicalSmiles": "CC(=O)Oc1ccccc1C(=O)O",
  "inchiKey": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
  "xLogP": 1.2,
  "hBondDonorCount": 1,
  "hBondAcceptorCount": 4,
  "rotatableBondCount": 3,
  "heavyAtomCount": 13,
  "complexity": 212,
  "charge": 0,
  "synonyms": ["aspirin", "acetylsalicylic acid", "2-acetoxybenzoic acid", "ASA", "Ecotrin"],
  "pubchemUrl": "https://pubchem.ncbi.nlm.nih.gov/compound/2244",
  "scrapedAt": "2026-06-03T10:00:00+00:00"
}

FAQs

Do I need an API key? No. The PubChem REST API is completely free and requires no registration.

How many results can I get? Up to 200 compounds per run. The PubChem database contains over 100 million compounds.

What is the rate limit? PubChem allows up to 5 requests per second. This actor respects that limit automatically.

Can I look up by SMILES or InChI? Use getByName mode: PubChem's name endpoint also accepts SMILES strings and InChI identifiers.

What if a compound has no IUPAC name? Fields are only included when data is available. If an IUPAC name is missing, only the commonName (from synonyms) will appear.

Is the data current? PubChem is updated continuously by the NCBI. Data accuracy reflects the current PubChem database state.

You might also like

PubChem Compound Scraper

crawlerbros/pubchem-scraper

Scrape PubChem - the world's largest free chemistry database with 100M+ compounds. Search by name, CID, SMILES, or full-text. Returns molecular formula, weight, SMILES, InChI, logP, H-bond counts, synonyms, and more.

PubChem Compound Scraper

crawlergang/pubchem-scraper

Scrape PubChem - the world's largest free chemistry database with 100M+ compounds. Search by name, CID, SMILES, or full-text. Returns molecular formula, weight, SMILES, InChI, logP, H-bond counts, synonyms, and more.

PubChem Compound Lookup - Chemistry API for Pharma R&D

azureblue/pubchem-compound-scraper

Look up chemical compounds in PubChem by name. Returns CID, molecular formula, weight, SMILES, InChI, IUPAC name, physicochemical properties, description and synonyms.

PubChem Compound Scraper

parseforge/pubchem-compound-scraper

Export chemical compound data from PubChem, the world's largest open chemistry database with 119M+ compounds. Look up by CID, name, SMILES, or InChIKey. Pull molecular formulas, weights, structures, synonyms, IUPAC names, and properties.

PubChem Compound Scraper - Chemical & Drug Data API

pink_comic/pubchem-compound-search

Scrape NIH PubChem chemical compound data by name, formula, SMILES, or CID. Get molecular weight, IUPAC, InChI, SMILES, XLogP, synonyms, and drug data for pharma, toxicology, and R&D workflows.

MyChem.info Drug Annotation Scraper

parseforge/mychem-drug-annotation-scraper

Resolve any drug name or InChIKey into a tidy annotation from MyChem.info. Returns DrugBank name and accession, ChEMBL and PubChem ids, UNII, ATC codes, chemical formula, molecular weight, indications, and mechanism classes. Great for drug reference tables and identifier crosswalks.

ChEMBL Molecules Scraper

parseforge/chembl-molecules-scraper

Scrape molecules from EBI ChEMBL public API including SMILES, InChI, molecular properties (MW, logP, HBA, HBD, PSA, RTB), max phase, ATC classifications, oral/parenteral/topical flags, first approval, black box warning, prodrug and withdrawn flag. No API key required.