
URL: https://apify.com/jungle_synthesizer/dlmf-nist-math-functions-scraper

DLMF NIST Math Functions Scraper · Apify


Pricing

Pay per event


Scrapes the NIST Digital Library of Mathematical Functions (DLMF) for structured equation data (MathML, LaTeX source, constraints, and referenced functions) across all 36 chapters and hundreds of sections.


Developer

BowTiedRaccoon

Maintained by Community


Last modified: 18 days ago


Scrapes the NIST Digital Library of Mathematical Functions (DLMF), the authoritative reference for special functions in mathematics and physics, to produce a structured, machine-readable corpus of numbered equations with MathML, LaTeX source, and associated metadata.

What it does

The actor performs a three-level hierarchical crawl:

  1. Index: discovers all 36 DLMF chapters from the homepage
  2. Chapter pages: discovers all sections within each chapter (typically 10–20 sections)
  3. Section pages: extracts every numbered equation, including MathML, LaTeX source, plain-text rendering, referenced symbols, and the canonical permalink

Across all 36 chapters, the DLMF contains approximately 5,000–10,000 numbered equations. A full crawl completes in minutes at the default concurrency.
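The three-level walk described above can be illustrated with a small sketch. It is not the actor's implementation: `FAKE_SITE` and `crawl` are hypothetical names, and the in-memory dictionary stands in for the real index, chapter, and section pages so the control flow can be followed without any network access. The parameter names mirror the actor's `startChapter`, `endChapter`, and `maxItems` inputs.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for the site: chapter -> section -> equation numbers.
FAKE_SITE: Dict[int, Dict[str, List[str]]] = {
    1: {"1.1": [], "1.2": ["1.2.1", "1.2.2"]},  # 1.1 has no numbered equations
    2: {"2.1": ["2.1.1"]},
    3: {"3.1": ["3.1.1", "3.1.2"]},
}


def crawl(
    site: Dict[int, Dict[str, List[str]]],
    start_chapter: int = 1,
    end_chapter: int = 36,
    max_items: int = 10,
) -> Iterator[dict]:
    """Walk index -> chapter -> section and yield one record per equation."""
    emitted = 0
    # Level 1: the index lists the chapters.
    for chapter in sorted(site):
        if not start_chapter <= chapter <= end_chapter:
            continue
        # Level 2: each chapter lists its sections.
        for section, equations in site[chapter].items():
            # Level 3: each section holds zero or more numbered equations.
            for number in equations:
                yield {
                    "chapter": chapter,
                    "section": section,
                    "equation_number": number,
                }
                emitted += 1
                # max_items == 0 means "unlimited", as in the actor's input.
                if max_items and emitted >= max_items:
                    return
```

A section with no numbered equations, such as the hypothetical `1.1` here, simply contributes no records.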

Output fields

| Field | Description |
|---|---|
| chapter | Chapter number (integer, 1–36) |
| section | Section identifier, e.g. 1.2 |
| title | Section title, e.g. Elementary Algebra |
| equation_number | DLMF equation number, e.g. 1.2.1 |
| equation_mathml | Full MathML XML for the equation |
| equation_tex | LaTeX source recovered from MathML alttext attribute |
| equation_text | Unicode plain-text rendering of the equation |
| constraints | Constraint text associated with the equation (if any) |
| referenced_functions | Pipe-separated list of symbol/function names referenced |
| url | Canonical DLMF permalink, e.g. http://dlmf.nist.gov/1.2.E1 |
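To make the `equation_tex` field concrete, here is a minimal sketch of reading the `alttext` attribute from a MathML `<math>` element with Python's standard library. The function name `tex_from_mathml` and the `SAMPLE_MATHML` snippet are illustrative assumptions; the sample is hand-written rather than copied from the DLMF, and the actor's own extraction code may differ.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only; real DLMF markup is richer.
SAMPLE_MATHML = (
    r'<math xmlns="http://www.w3.org/1998/Math/MathML" '
    r'alttext="\frac{n!}{(n-k)!k!}" display="block">'
    r"<mfrac><mi>a</mi><mi>b</mi></mfrac>"
    r"</math>"
)


def tex_from_mathml(mathml: str) -> str:
    """Return the alttext of the first <math> element, or "" if none exists."""
    root = ET.fromstring(mathml)
    # iter() visits the root first, so a bare <math> document also works.
    for element in root.iter():
        # Drop any "{namespace}" prefix before comparing the tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "math":
            return element.get("alttext", "")
    return ""
```

Applied to the `equation_mathml` value of a record, this should yield the same string as `equation_tex`, provided the source markup carries an `alttext` attribute.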

Use cases

  • Symbolic math / CAS training data: verified special-function formulas with LaTeX and MathML
  • RAG / vector search corpora: ground-truth equation database for scientific-computing AI agents
  • Formula search engines: structured index of equations by chapter/section with canonical IDs
  • Verification datasets: NIST-authoritative identities for function evaluations

Input parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| maxItems | integer | 10 | Maximum number of equations to scrape (0 = unlimited) |
| startChapter | integer | 1 | First chapter to crawl (1–36) |
| endChapter | integer | 36 | Last chapter to crawl (1–36, omit for all) |
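As a sketch of how these parameters might be assembled and passed to the actor from Python, the helper below builds and range-checks the input object. `build_run_input` and `run_actor` are hypothetical names. `run_actor` assumes the third-party `apify-client` package is installed and a valid API token is supplied, and it takes the actor ID from the store URL; treat it as an untested outline rather than verified usage.

```python
def build_run_input(
    max_items: int = 10, start_chapter: int = 1, end_chapter: int = 36
) -> dict:
    """Build the actor input, checking the ranges documented above."""
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    if not 1 <= start_chapter <= end_chapter <= 36:
        raise ValueError("chapters must satisfy 1 <= start <= end <= 36")
    return {
        "maxItems": max_items,
        "startChapter": start_chapter,
        "endChapter": end_chapter,
    }


def run_actor(token: str, run_input: dict) -> list:
    """Start a run and collect its dataset items (assumes apify-client)."""
    # Imported here so the helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    actor = client.actor("jungle_synthesizer/dlmf-nist-math-functions-scraper")
    run = actor.call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `build_run_input(max_items=0, start_chapter=1, end_chapter=1)` describes an uncapped crawl of Chapter 1 only.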

Example output record

{
  "chapter": 1,
  "section": "1.2",
  "title": "Elementary Algebra",
  "equation_number": "1.2.1",
  "equation_tex": "\\genfrac{(}{)}{0.0pt}{}{n}{k}=\\frac{n!}{(n-k)!k!}",
  "equation_text": "(nk)=n!/(n−k)!\u2062k!",
  "constraints": "",
  "referenced_functions": "(mn): binomial coefficient | !: factorial (as in n!) | n: nonnegative integer",
  "url": "http://dlmf.nist.gov/1.2.E1"
}
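Downstream code will often want `referenced_functions` as a mapping rather than a single string. The sketch below splits it into symbol/description pairs. The function name `parse_referenced_functions` is hypothetical, and the `" | "` and `": "` delimiters are inferred only from the single example record above, so other records may need more careful handling.

```python
def parse_referenced_functions(raw: str) -> dict:
    """Split a 'symbol: description | symbol: description' string into a dict."""
    entries = {}
    for part in raw.split(" | "):
        part = part.strip()
        if not part:
            continue  # an empty field yields no entries
        # Split on the first ": " only, so descriptions may contain colons.
        symbol, _, description = part.partition(": ")
        entries[symbol] = description
    return entries
```

With the example record, this gives the three keys `(mn)`, `!`, and `n`, each mapped to its description.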

Notes

  • The DLMF is a US government publication (NIST). Content is in the public domain.
  • No proxy required: dlmf.nist.gov is a clean US government host with no anti-bot measures.
  • Chapter 1 alone contains ~180 equations across 18 sections. A full 36-chapter run yields 5,000+ records.
  • Sections containing only notation tables (no numbered equations) return 0 results; this is expected.
