URL: https://apify.com/pink_comic/pubchem-compound-search

PubChem Scraper - Chemical Compound & Drug Data API · Apify


๐Ÿ‘ PubChem Compound Scraper - Chemical & Drug Data API avatar

PubChem Compound Scraper - Chemical & Drug Data API

Pricing

from $2.00 / 1,000 results

Scrape NIH PubChem chemical compound data by name, formula, SMILES, or CID. Get molecular weight, IUPAC, InChI, SMILES, XLogP, synonyms, and drug data for pharma, toxicology, and R&D workflows.

Rating: 0.0 (0)

Developer: Ava Torres

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 4
  • Monthly active users: 1
  • Last modified: 14 days ago

PubChem Compound Search

Search 115M+ chemical compounds from NIH PubChem. Look up compounds by name, molecular formula, SMILES notation, or PubChem CID. Returns molecular properties including weight, IUPAC name, canonical SMILES, InChI, XLogP, TPSA, hydrogen bond counts, and more. No API key required.


Data Source

NIH PubChem PUG REST API (pubchem.ncbi.nlm.nih.gov). PubChem is the world's largest publicly accessible chemical database, maintained by the National Center for Biotechnology Information (NCBI) as part of the National Institutes of Health.
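For orientation, here is a minimal sketch of the kind of PUG REST request that sits behind a name or CID lookup. The URL pattern and the PropertyTable response shape follow PubChem's public PUG REST conventions. Whether this actor issues exactly these requests is an assumption, and the helper names property_url and fetch_properties are illustrative only.

```python
import json
from typing import Iterable
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(namespace: str, identifier: str, properties: Iterable[str]) -> str:
    """Build a PUG REST URL that returns the requested properties as JSON."""
    if namespace not in ("name", "cid"):
        # Formula and SMILES searches use other PUG REST routes; not covered here.
        raise ValueError(f"unsupported namespace: {namespace}")
    encoded = quote(identifier, safe="")  # escape spaces and slashes in names
    return f"{PUG_REST_BASE}/compound/{namespace}/{encoded}/property/{','.join(properties)}/JSON"


def fetch_properties(url: str) -> list:
    """Download the URL and return the list of per-compound property records."""
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["PropertyTable"]["Properties"]
```

Calling property_url("name", "aspirin", ["MolecularFormula", "MolecularWeight"]) yields a URL that can be opened in a browser to inspect the raw PubChem response.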


Output Fields

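Output fields depend on the property set selected with the properties input (basic, physical, or all). Each set includes every field from the set before it.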

Basic Properties

| Field | Type | Description |
| --- | --- | --- |
| cid | integer | PubChem Compound ID |
| molecularFormula | string | Molecular formula (e.g., C9H8O4) |
| molecularWeight | number | Molecular weight in g/mol |
| canonicalSMILES | string | Canonical SMILES notation |
| isomericSMILES | string | Isomeric SMILES notation |
| iupacName | string | IUPAC systematic name |
| inchi | string | InChI identifier |
| inchiKey | string | Hashed InChIKey (27 characters) |

Physical Properties (adds to Basic)

| Field | Type | Description |
| --- | --- | --- |
| xLogP | number | Computed octanol-water partition coefficient (log P) |
| exactMass | number | Exact mass (mass of the most likely isotopic composition) |
| tpsa | number | Topological polar surface area (Å²) |
| hBondDonorCount | integer | Number of hydrogen bond donors |
| hBondAcceptorCount | integer | Number of hydrogen bond acceptors |
| rotatableBondCount | integer | Number of rotatable bonds |

All Properties (adds to Physical)

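| Field | Type | Description |
| --- | --- | --- |
| monoisotopicMass | number | Monoisotopic mass (computed from the most abundant isotope of each element) |
| heavyAtomCount | integer | Number of non-hydrogen atoms |
| complexity | number | Structural complexity score |
| charge | integer | Formal charge |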
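Because the three property sets build on each other, the fields to expect in a result can be derived from the set name. The sketch below encodes the tables above. The function name expected_fields is illustrative and not part of the actor.

```python
BASIC_FIELDS = [
    "cid", "molecularFormula", "molecularWeight", "canonicalSMILES",
    "isomericSMILES", "iupacName", "inchi", "inchiKey",
]
PHYSICAL_EXTRA = [
    "xLogP", "exactMass", "tpsa",
    "hBondDonorCount", "hBondAcceptorCount", "rotatableBondCount",
]
ALL_EXTRA = ["monoisotopicMass", "heavyAtomCount", "complexity", "charge"]


def expected_fields(property_set: str) -> list:
    """Return the output field names documented for a property set."""
    if property_set == "basic":
        return list(BASIC_FIELDS)
    if property_set == "physical":
        return BASIC_FIELDS + PHYSICAL_EXTRA
    if property_set == "all":
        return BASIC_FIELDS + PHYSICAL_EXTRA + ALL_EXTRA
    raise ValueError(f"unknown property set: {property_set}")
```

A check such as set(expected_fields("physical")) - set(item) can flag fields missing from a returned record. Whether the actor omits a field or returns null when PubChem has no value for it is not documented here, so treat missing fields as possible.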

Use Cases

  • Drug discovery and cheminformatics -- retrieve molecular properties for compound screening, ADMET analysis, or structure-activity relationship (SAR) research.
  • Chemical database integration -- pull structured compound data into internal databases, ELN systems, or research workflows.
  • Regulatory and safety documentation -- retrieve InChI, InChIKey, SMILES, and molecular formula for compound identification in regulatory filings.
  • Academic research -- access compound properties for computational chemistry, machine learning training data, or literature-related compound lookups.
  • Pharmaceutical market intelligence -- look up drug compound structures and properties for competitive analysis or formulation research.
  • Educational tools -- build chemistry reference tools that surface structured property data for any compound by name.
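As an illustration of the compound-screening use case, the sketch below applies Lipinski's rule of five to one result record. It assumes the record was fetched with the physical or all property set, so that xLogP and the hydrogen bond counts are present. The function name lipinski_violations is hypothetical, and the rule is only one common screening heuristic, not something the actor computes.

```python
def lipinski_violations(compound: dict) -> list:
    """List the rule-of-five limits that a compound record exceeds."""
    checks = [
        ("molecularWeight", 500.0, "molecular weight > 500"),
        ("xLogP", 5.0, "XLogP > 5"),
        ("hBondDonorCount", 5, "more than 5 H-bond donors"),
        ("hBondAcceptorCount", 10, "more than 10 H-bond acceptors"),
    ]
    violations = []
    for field, limit, message in checks:
        value = compound.get(field)
        if value is None:
            continue  # a missing value is skipped, not counted as a violation
        if float(value) > limit:  # float() also accepts numeric strings
            violations.append(message)
    return violations
```

An empty list means the record passes all four checks for the values that were available.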

How to Use

Set the input fields and run the actor. Results are pushed to the Apify dataset and can be exported as JSON, CSV, Excel, XML, or RSS.

Input Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchType | string | name | name (compound name), formula (molecular formula), smiles (SMILES notation), or cid (PubChem CID) |
| query | string | | Search term. Required for all search types |
| properties | string | all | Property set to retrieve: basic, physical, or all |
| maxResults | integer | 10 | Maximum compounds to return (1-200) |
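The constraints in the table can be checked before a run is started. The following is a minimal sketch of a client-side validator. The function name build_run_input is hypothetical, and the actor may apply its own validation that differs from this.

```python
SEARCH_TYPES = ("name", "formula", "smiles", "cid")
PROPERTY_SETS = ("basic", "physical", "all")


def build_run_input(query: str, search_type: str = "name",
                    properties: str = "all", max_results: int = 10) -> dict:
    """Validate the documented parameters and return the actor input dict."""
    if not query or not query.strip():
        raise ValueError("query is required for all search types")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"searchType must be one of {SEARCH_TYPES}")
    if properties not in PROPERTY_SETS:
        raise ValueError(f"properties must be one of {PROPERTY_SETS}")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be between 1 and 200")
    return {
        "searchType": search_type,
        "query": query.strip(),
        "properties": properties,
        "maxResults": max_results,
    }
```

The defaults mirror the table: a call with only a query produces a name search with the all property set and up to 10 results.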

Example -- Look Up a Drug by Name

{
  "searchType": "name",
  "query": "aspirin",
  "properties": "all",
  "maxResults": 1
}

Example -- Search by Molecular Formula

{
  "searchType": "formula",
  "query": "C9H8O4",
  "properties": "physical",
  "maxResults": 10
}

Example -- Search by SMILES

{
  "searchType": "smiles",
  "query": "CC(=O)OC1=CC=CC=C1C(O)=O",
  "properties": "basic",
  "maxResults": 5
}

Example -- Direct CID Lookup

{
  "searchType": "cid",
  "query": "2244",
  "properties": "all",
  "maxResults": 1
}
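Any of the inputs above can also be submitted from code. The sketch below uses the official apify-client Python package (pip install apify-client). The actor ID is taken from this listing's URL. Reading the token from an APIFY_TOKEN environment variable is an assumption, and the helper names run_lookup and pick_identity are illustrative.

```python
import os
from typing import Optional

ACTOR_ID = "pink_comic/pubchem-compound-search"  # from this listing's URL


def pick_identity(item: dict) -> dict:
    """Keep only the identifier fields of one dataset item."""
    keys = ("cid", "molecularFormula", "iupacName", "inchiKey")
    return {key: item.get(key) for key in keys}


def run_lookup(run_input: dict, token: Optional[str] = None) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # imported lazily; third-party package

    client = ApifyClient(token or os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, run_lookup({"searchType": "cid", "query": "2244", "properties": "basic", "maxResults": 1}) would return a list of result records, each of which can be passed to pick_identity.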

Cost

  • Actor start fee: ~$0.10 per run
  • Compute: minimal -- typical runs complete in seconds
  • Data cost: $0.005 per result

Most lookups cost under $0.15 total.
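The arithmetic behind that estimate can be sketched as follows, using only the start fee and per-result figures listed in this section. Compute charges are left out because no rate is given, and the headline rate at the top of the listing (from $2.00 / 1,000 results) may reflect a different pricing tier, so treat the output as a rough estimate.

```python
from decimal import Decimal

START_FEE = Decimal("0.10")     # approximate actor start fee per run, from this section
PER_RESULT = Decimal("0.005")   # data cost per result, from this section


def estimate_cost(results: int) -> Decimal:
    """Estimate the cost of one run in USD, excluding compute."""
    if results < 0:
        raise ValueError("results must not be negative")
    return START_FEE + PER_RESULT * results
```

Under these figures a single-compound lookup comes to about $0.105, a run returning 10 results to $0.15, and a run at the 200-result maximum to $1.10.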


Output Formats

Results are available in the Apify dataset viewer and can be exported as:

  • JSON
  • CSV
  • Excel (XLSX)
  • XML
  • RSS
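The same formats can be requested programmatically through the Apify dataset items endpoint, which takes a format query parameter. The sketch below only builds the download URL. The function name dataset_export_url is illustrative, and a private dataset would additionally need an API token, which is not shown.

```python
from urllib.parse import quote, urlencode

EXPORT_FORMATS = ("json", "csv", "xlsx", "xml", "rss")


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the Apify API URL that downloads a dataset in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{quote(dataset_id, safe='')}/items?{query}"
```

The dataset ID of a finished run is shown in the Apify Console and is also returned in the run object as defaultDatasetId.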

FAQ

Do I need an NIH or PubChem account? No. PubChem's PUG REST API is fully public and requires no authentication.

How many compounds does PubChem contain? PubChem contains over 115 million unique compounds as of 2024. It is the largest publicly accessible chemical database in the world.

What is an InChIKey? The InChIKey is a fixed-length, hashed representation of the full InChI identifier. It is widely used as a stable, searchable identifier for chemical compounds across databases and publications.
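Because the InChIKey has a fixed layout (14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, and one final letter, 27 characters in total), its shape can be checked before it is used as a join key. The sketch below is a format check only. It does not confirm that the key belongs to a real compound, and the function name is illustrative.

```python
import re

# 14 letters, hyphen, 10 letters, hyphen, 1 letter = 27 characters
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(value: str) -> bool:
    """Return True if the string has the standard InChIKey layout."""
    return INCHIKEY_PATTERN.fullmatch(value) is not None
```

For example, the InChIKey of aspirin, BSYNRYMUTXBXSQ-UHFFFAOYSA-N, passes this check, while a lowercase or truncated copy of it does not.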

What is XLogP? XLogP is the calculated octanol-water partition coefficient, a key measure of lipophilicity used in drug discovery and ADMET property prediction. Higher values indicate greater lipophilicity.

What is TPSA? Topological Polar Surface Area is the sum of the surface area contributed by polar atoms. It is used to predict intestinal absorption, blood-brain barrier penetration, and other pharmacokinetic properties.

Can I search for drugs by brand name? Yes. PubChem includes synonyms for many compounds including brand names, generic names, and chemical names. Searching by brand name (e.g., Tylenol, Lipitor) will return the corresponding compound record.

What is the maximum number of results per run? The maximum is 200 compounds per run. For name-based searches PubChem typically returns the closest matching compound first.
