URL: https://apify.com/parseforge/mychem-drug-annotation-scraper

MyChem.info Drug Annotation Scraper · Apify


Pricing: from $2.00 / 1,000 results

Resolve any drug name or InChIKey into a tidy annotation from MyChem.info. Returns DrugBank name and accession, ChEMBL and PubChem ids, UNII, ATC codes, chemical formula, molecular weight, indications, and mechanism classes. Great for drug reference tables and identifier crosswalks.

Rating: 0.0 (0)

Developer: ParseForge
Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 22 days ago

💊 MyChem.info Drug Annotation Scraper

🚀 Turn any drug name into a clean annotation record in seconds. Resolve imatinib, aspirin, or a whole therapeutic area into DrugBank, ChEMBL, PubChem, UNII, ATC, formula, weight, indications, and mechanism, all from one keyless source.

🕒 Last updated: 2026-06-05 · 📊 23 fields per record · keyless public API · global drug and compound coverage

MyChem.info is the BioThings drug and chemical knowledge hub that aggregates DrugBank, ChEMBL, PubChem, DrugCentral, UNII, and more behind a single InChIKey-keyed API. This Actor queries MyChem.info, resolves each drug name or InChIKey to its annotation document, and returns a curated subset of the most useful fields instead of the full nested blob.

Coverage spans small molecules, biologics, and investigational compounds that carry a DrugBank annotation. You can look up drugs one by one, paste a batch of names or InChIKeys, or run a free-text query like a therapeutic area to pull back the best-matching annotated compounds.

🎯 Target Audience | 💡 Primary Use Cases
Pharma and biotech researchers | Build a drug reference table across DrugBank, ChEMBL, and PubChem
Data scientists and bioinformaticians | Map drug names to standardized identifiers and ATC codes
Clinical and regulatory teams | Pull indications, mechanism classes, and approval status
Chemistry and cheminformatics teams | Collect formula, molecular weight, SMILES, and CAS numbers

📋 What the MyChem.info Drug Annotation Scraper does

  • Resolves drug names (for example imatinib, metformin) or InChIKeys to their MyChem.info annotation.
  • Runs a free-text search query and returns the best-matching drug-annotated compounds.
  • Maps a curated subset of fields from DrugBank, ChEMBL, PubChem, DrugCentral, and UNII.
  • Returns standardized identifiers, ATC codes, chemical properties, indications, and mechanism classes.
  • Combines a search query and an explicit drug list in a single run when you want both.
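Because names and InChIKeys can be mixed freely, it can help to tell them apart before submitting a list. The Python sketch below assumes the standard 27-character InChIKey layout (14 letters, hyphen, 10 letters, hyphen, 1 letter) and reuses the URL pattern visible in the `url` output field. The helper names `is_inchikey`, `annotation_url`, and `split_entries` are hypothetical, and the Actor's own resolution logic may differ.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey(entry: str) -> bool:
    """Return True when the entry looks like an InChIKey rather than a drug name."""
    return INCHIKEY_RE.fullmatch(entry.strip()) is not None


def annotation_url(inchikey: str) -> str:
    """Build an annotation URL following the pattern of the `url` output field."""
    if not is_inchikey(inchikey):
        raise ValueError(f"not an InChIKey: {inchikey!r}")
    return f"https://mychem.info/v1/chem/{inchikey.strip()}"


def split_entries(entries):
    """Partition a mixed drug list into (names, inchikeys), preserving order."""
    names, keys = [], []
    for entry in entries:
        (keys if is_inchikey(entry) else names).append(entry.strip())
    return names, keys


if __name__ == "__main__":
    names, keys = split_entries(["imatinib", "KTUFNOKKBVMGRW-UHFFFAOYSA-N", "metformin"])
    print(names)
    print(keys)
    print(annotation_url(keys[0]))
```

A check like this only inspects the shape of the string; whether a key actually resolves still depends on MyChem.info having a DrugBank-annotated record for it.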

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

Field | Type | Required | Description
searchQuery | string | one of two | A free-text term such as a drug name or therapeutic area (for example leukemia, kinase).
drugList | array | one of two | Drug names or InChIKeys to resolve, one per entry.
maxItems | integer | no | Cap on how many records are produced. Free plan is limited to 10.

Provide a searchQuery, a drugList, or both. At least one is required.

{
"drugList":["imatinib","dasatinib","nilotinib","aspirin","metformin"],
"maxItems":10
}

{
"searchQuery":"leukemia",
"maxItems":25
}
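The same inputs can be submitted programmatically. The Python sketch below uses only the standard library: it enforces the rule that at least one of `searchQuery` or `drugList` is present, then posts the input to Apify's synchronous run endpoint. The Actor ID is inferred from the listing URL, `APIFY_TOKEN` is a placeholder environment variable, and the helper names are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: Actor ID derived from the listing URL, with "~" joining user and Actor name.
ACTOR_ID = "parseforge~mychem-drug-annotation-scraper"
RUN_SYNC_URL = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"


def build_input(search_query=None, drug_list=None, max_items=None):
    """Assemble the Actor input, enforcing that at least one source is given."""
    if not search_query and not drug_list:
        raise ValueError("provide searchQuery, drugList, or both")
    run_input = {}
    if search_query:
        run_input["searchQuery"] = search_query
    if drug_list:
        run_input["drugList"] = list(drug_list)
    if max_items is not None:
        run_input["maxItems"] = int(max_items)
    return run_input


def run_actor(run_input, token):
    """Start a run, wait for it to finish, and return the dataset items (network call)."""
    request = urllib.request.Request(
        RUN_SYNC_URL,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    payload = build_input(drug_list=["imatinib", "dasatinib"], max_items=10)
    print(json.dumps(payload))
    token = os.environ.get("APIFY_TOKEN")  # placeholder; nothing is sent without it
    if token:
        for record in run_actor(payload, token):
            print(record.get("drugName"), record.get("inchikey"))
```

Validating locally before the request avoids spending a run on an input the Actor would reject anyway.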

⚠️ Good to Know: Only compounds carrying a DrugBank annotation are returned, so very obscure or purely chemical entries may not resolve. A few fields such as ATC codes, PubChem CID, or approval year can be absent for a given drug when the upstream source has no value, and those come back as null rather than being faked.

📊 Output

Each record is one drug or compound annotation.

🏷 Field | Description
💊 drugName | Primary DrugBank name
🧬 inchikey | InChIKey, the MyChem.info record id
🆔 drugbankId | DrugBank accession (for example DB00619)
📚 drugbankAccessions | All DrugBank accession numbers
🧪 chemblId | ChEMBL molecule id
🔬 pubchemCid | PubChem compound id
🔖 unii | FDA UNII code
🧾 casNumber | CAS registry number
🏷 atcCodes | WHO ATC classification codes
🧫 moleculeType | Molecule type, for example Small molecule
📈 maxPhase | Highest development phase reached
📅 firstApprovalYear | Year of first approval
🗂 drugGroups | Status and route flags, for example approved, oral
⚗️ molecularFormula | Chemical formula
⚖️ molecularWeight | Molecular weight
🧩 smiles | SMILES structure string
🩺 primaryIndication | Lead indication
📋 indications | Indication list
🔬 mechanism | Mechanism and pharmacology classes
🌐 source | Which input produced the record
🔗 url | MyChem.info annotation endpoint
🕒 scrapedAt | Timestamp of collection
❌ error | Error message, null on success

Real sample records from a live run:

{
"inchikey":"KTUFNOKKBVMGRW-UHFFFAOYSA-N",
"drugName":"Imatinib",
"drugbankId":"DB00619",
"chemblId":"CHEMBL941",
"pubchemCid":"5291",
"unii":"BKJ8M8G5HI",
"casNumber":"152459-95-5",
"atcCodes":["L01EA01"],
"moleculeType":"Small molecule",
"maxPhase":4,
"firstApprovalYear":2001,
"drugGroups":["approved","oral"],
"molecularFormula":"C29H31N7O",
"molecularWeight":493.6,
"primaryIndication":"Chronic Myelocytic Leukemia Accelerated Phase",
"mechanism":["Kinase Inhibitor","tyrosine kinase inhibitors","Protein Kinase Inhibitors"],
"url":"https://mychem.info/v1/chem/KTUFNOKKBVMGRW-UHFFFAOYSA-N",
"error":null
}

{
"inchikey":"ZBNZXTGUTAYRHI-UHFFFAOYSA-N",
"drugName":"Dasatinib",
"drugbankId":"DB01254",
"chemblId":"CHEMBL1421",
"pubchemCid":"3062316",
"unii":"X78UG0A0RN",
"casNumber":"302962-49-8",
"atcCodes":["L01EA02"],
"moleculeType":"Small molecule",
"maxPhase":4,
"firstApprovalYear":2006,
"drugGroups":["approved","oral"],
"molecularFormula":"C22H26ClN7O2S",
"molecularWeight":488,
"primaryIndication":"Philadelphia Chromosome Positive Chronic Myelocytic Leukemia",
"mechanism":["tyrosine kinase inhibitors","Protein Kinase Inhibitors"],
"url":"https://mychem.info/v1/chem/ZBNZXTGUTAYRHI-UHFFFAOYSA-N",
"error":null
}

{
"inchikey":"HHZIURLSWUIHRB-UHFFFAOYSA-N",
"drugName":"Nilotinib",
"drugbankId":"DB04868",
"chemblId":"CHEMBL255863",
"pubchemCid":"644241",
"unii":"F41401512X",
"casNumber":"641571-10-0",
"atcCodes":["L01EA03"],
"moleculeType":"Small molecule",
"maxPhase":4,
"firstApprovalYear":2007,
"drugGroups":["approved","oral"],
"molecularFormula":"C28H22F3N7O",
"molecularWeight":529.5,
"primaryIndication":"Chronic Myelocytic Leukemia Accelerated Phase",
"mechanism":["Kinase Inhibitor","tyrosine kinase inhibitors"],
"url":"https://mychem.info/v1/chem/HHZIURLSWUIHRB-UHFFFAOYSA-N",
"error":null
}
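Since missing values come back as null and failures are reported in the `error` field, a consumer will usually separate the two cases before joining records elsewhere. The Python sketch below shows one way to do that. The first record is trimmed from the Imatinib sample above; the other two, including the error text, are invented purely for illustration, and the helper names are hypothetical.

```python
def split_by_error(records):
    """Return (ok, failed): records whose `error` is null versus those with a message."""
    ok = [r for r in records if r.get("error") is None]
    failed = [r for r in records if r.get("error") is not None]
    return ok, failed


def null_fields(record):
    """List the fields the upstream source left empty (null) for this record."""
    return sorted(key for key, value in record.items() if value is None and key != "error")


if __name__ == "__main__":
    records = [
        # Trimmed from the Imatinib sample record.
        {"drugName": "Imatinib", "pubchemCid": "5291", "atcCodes": ["L01EA01"], "error": None},
        # Illustrative only: these values and the error text are invented for this sketch.
        {"drugName": "ExampleDrug", "pubchemCid": None, "atcCodes": None, "error": None},
        {"drugName": None, "error": "no match found"},
    ]
    ok, failed = split_by_error(records)
    print(len(ok), len(failed))
    for record in ok:
        print(record["drugName"], null_fields(record))
```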

✨ Why choose this Actor

  • One curated record instead of a deeply nested aggregation blob.
  • Cross-references DrugBank, ChEMBL, PubChem, DrugCentral, and UNII in a single row.
  • Accepts names, InChIKeys, and free-text queries interchangeably.
  • Keyless public source, so no API account or token juggling on the source side.
  • Null is honest, never invented, so your downstream joins stay clean.

📈 How it compares to alternatives

Approach | Identifiers | Indications and mechanism | Setup
This Actor | DrugBank, ChEMBL, PubChem, UNII, CAS, ATC in one row | Included | Paste names and run
Raw MyChem.info API | Available but deeply nested | Buried in nested blobs | Write your own parser
Manual DrugBank lookup | One source at a time | Partial | Slow and manual

🚀 How to use

  1. Sign up for a free Apify account.
  2. Open the MyChem.info Drug Annotation Scraper.
  3. Enter a searchQuery, a drugList of names or InChIKeys, or both.
  4. Set maxItems if you want to cap the run, then start the Actor.
  5. Collect your results from the dataset once the run finishes.

💼 Business use cases

Pharma competitive intelligence

Goal | How this helps
Track approved drug classes | Pull ATC codes and approval years across a target list
Benchmark mechanisms | Compare mechanism classes for a therapeutic area
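Tracking drug classes usually comes down to grouping records by an ATC prefix. The Python sketch below groups by the first five characters of each code, which in the WHO ATC scheme is the fourth-level (chemical subgroup) prefix; a shorter prefix gives a coarser class. The three records are trimmed from the sample output above, and `group_by_atc` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_atc(records, prefix_length=5):
    """Map each ATC prefix to the sorted drug names that carry a code under it."""
    groups = defaultdict(set)
    for record in records:
        for code in record.get("atcCodes") or []:  # atcCodes may be null
            groups[code[:prefix_length]].add(record["drugName"])
    return {prefix: sorted(names) for prefix, names in sorted(groups.items())}


if __name__ == "__main__":
    records = [
        {"drugName": "Imatinib", "atcCodes": ["L01EA01"], "firstApprovalYear": 2001},
        {"drugName": "Dasatinib", "atcCodes": ["L01EA02"], "firstApprovalYear": 2006},
        {"drugName": "Nilotinib", "atcCodes": ["L01EA03"], "firstApprovalYear": 2007},
    ]
    print(group_by_atc(records))     # {'L01EA': ['Dasatinib', 'Imatinib', 'Nilotinib']}
    print(group_by_atc(records, 3))  # {'L01': ['Dasatinib', 'Imatinib', 'Nilotinib']}
```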

Data engineering and reference data

Goal | How this helps
Build an identifier crosswalk | Map names to DrugBank, ChEMBL, PubChem, and UNII
Enrich an existing catalog | Add formula, weight, and CAS to your records
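A crosswalk of this kind is essentially one row per InChIKey with the identifier columns side by side. The Python sketch below renders records to CSV with the standard library, leaving null identifiers as empty cells. The column selection and the helper name `to_crosswalk_csv` are assumptions for illustration; the sample row is taken from the Imatinib record above.

```python
import csv
import io

# Assumed column choice for the crosswalk; adjust to the identifiers you need.
CROSSWALK_FIELDS = ["inchikey", "drugName", "drugbankId", "chemblId", "pubchemCid", "unii", "casNumber"]


def to_crosswalk_csv(records):
    """Render one CSV row per record, writing null identifiers as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=CROSSWALK_FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Pick only the crosswalk columns; missing keys become None, written as "".
        writer.writerow({field: record.get(field) for field in CROSSWALK_FIELDS})
    return buffer.getvalue()


if __name__ == "__main__":
    imatinib = {
        "inchikey": "KTUFNOKKBVMGRW-UHFFFAOYSA-N",
        "drugName": "Imatinib",
        "drugbankId": "DB00619",
        "chemblId": "CHEMBL941",
        "pubchemCid": "5291",
        "unii": "BKJ8M8G5HI",
        "casNumber": "152459-95-5",
        "molecularWeight": 493.6,  # ignored: not a crosswalk column
    }
    print(to_crosswalk_csv([imatinib]), end="")
```

Keying the table on `inchikey` rather than on the drug name keeps joins stable when the same compound appears under different names.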

Clinical and regulatory research

Goal | How this helps
Review indications | Read primary and full indication lists per drug
Check development status | Use max phase and approval year as filters
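Filtering on development status can be sketched in a few lines of Python. The example assumes `maxPhase` and `firstApprovalYear` are numeric or null, as in the sample records above, and treats a null value as "does not pass the filter". The helper name `filter_by_status` and its defaults are hypothetical.

```python
def filter_by_status(records, min_phase=4, approved_since=None):
    """Keep records at or above `min_phase`, optionally first approved in or after a year."""
    kept = []
    for record in records:
        phase = record.get("maxPhase")
        year = record.get("firstApprovalYear")
        if phase is None or phase < min_phase:
            continue  # unknown or too-early phase
        if approved_since is not None and (year is None or year < approved_since):
            continue  # unknown or too-early approval year
        kept.append(record)
    return kept


if __name__ == "__main__":
    records = [
        {"drugName": "Imatinib", "maxPhase": 4, "firstApprovalYear": 2001},
        {"drugName": "Dasatinib", "maxPhase": 4, "firstApprovalYear": 2006},
        {"drugName": "Nilotinib", "maxPhase": 4, "firstApprovalYear": 2007},
    ]
    recent = filter_by_status(records, approved_since=2005)
    print([r["drugName"] for r in recent])  # ['Dasatinib', 'Nilotinib']
```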

Cheminformatics

Goal | How this helps
Seed a structure dataset | Collect SMILES and molecular properties
Standardize compound names | Resolve free text to canonical InChIKeys

🔌 Automating MyChem.info Drug Annotation Scraper

Connect runs to the tools your team already uses:

  • Make and Zapier to trigger runs and route records into other apps.
  • Slack to post a summary when a run finishes.
  • Airbyte to load results into a warehouse.
  • GitHub Actions to schedule recurring pulls.
  • Google Drive to archive each run output for your team.

🌟 Beyond business use cases

  • Research: assemble a tidy drug reference table for a literature review.
  • Personal: look up the formula, weight, and class of a medication you are curious about.
  • Non-profit: support patient education resources with standardized drug facts.
  • Experimentation: prototype a chatbot that answers questions about drug identifiers.

🤖 Ask an AI assistant

Paste your results into ChatGPT, Claude, Perplexity, or Copilot and ask it to summarize mechanisms, group drugs by ATC class, or spot gaps in your reference table.

❓ Frequently Asked Questions

Is MyChem.info free to query? Yes, MyChem.info is a keyless public BioThings API. This Actor adds resolution, curation, and clean output on top.

What can I put in the drug list? Drug names such as imatinib or metformin, or InChIKeys such as KTUFNOKKBVMGRW-UHFFFAOYSA-N. Both are accepted in the same list.

What does the search query return? It returns the best-matching compounds that carry a DrugBank annotation, so you get usable drug records rather than packaging entries.

Why is a field sometimes null? The upstream source had no value for that drug. Nulls are kept as null and never invented.

Which sources are combined? DrugBank, ChEMBL, PubChem, DrugCentral, and UNII, all keyed by InChIKey inside MyChem.info.

How is a name resolved to a record? The Actor searches MyChem.info for the name among drug-annotated compounds and selects the top match.

Can I pull a whole therapeutic area? Yes, use a search query like leukemia or kinase and raise maxItems to collect more compounds.

Does it return chemical structure? Yes, each record includes a SMILES string plus molecular formula and weight when available.

What is the InChIKey used for? It is the stable record id in MyChem.info and a portable key for joining across chemical datasets.

How many records can I get? Free plan runs are limited to 10. Paid plans can collect up to 1,000,000.

🔌 Integrate with any app

Every run writes to a structured dataset you can pull through the Apify API or connect to your stack with the integrations above.
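As a rough sketch of pulling a finished run's dataset through the Apify API, the Python below builds a dataset items URL and, when a token is supplied, fetches the records as JSON. The dataset ID is a placeholder you would copy from your own run, `APIFY_TOKEN` and `APIFY_DATASET_ID` are assumed environment variable names, and the helper names are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request


def dataset_items_url(dataset_id, fmt="json", fields=None, limit=None):
    """Build the dataset items URL, optionally narrowing fields and row count."""
    query = {"format": fmt}
    if fields:
        query["fields"] = ",".join(fields)
    if limit is not None:
        query["limit"] = str(limit)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urllib.parse.urlencode(query)}"


def fetch_items(dataset_id, token, fields=None, limit=None):
    """Download dataset items as parsed JSON (network call; JSON format only)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, "json", fields, limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # "DATASET_ID" is a placeholder, not a real dataset.
    print(dataset_items_url("DATASET_ID", fields=["drugName", "inchikey"], limit=10))
    token = os.environ.get("APIFY_TOKEN")
    dataset_id = os.environ.get("APIFY_DATASET_ID")
    if token and dataset_id:
        for item in fetch_items(dataset_id, token, limit=5):
            print(item.get("drugName"))
```

Requesting only the fields you need keeps the response small when the dataset holds many records.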

🔗 Recommended Actors

💡 Pro Tip: browse the complete ParseForge collection.

🆘 Need Help? Open our contact form.

⚠️ Disclaimer: independent tool, not affiliated with MyChem.info or BioThings. Only publicly available data collected.
