Voozh

Helping everyone from startups to Fortune 10 enterprises unlock their data.

Harvey
Scale AI
Newfront
Medallion
Vanta
Legora
Rogo
Levelpath
JLL
Vise
Laurel
Toast
Mercor
Zip
Anterior
Supio

+ many more

Endpoint: /parse (Parse)
Use when: Structured content from any document is needed for LLM or RAG use.
Output: Structured chunks with typed blocks, bounding boxes, and confidence scores.
How it works with Parse: Read the Parse docs.

Endpoint: /extract (Extract)
Use when: The fields to pull are defined and typed JSON is needed.
Output: Schema-typed JSON with optional citations on every value.
How it works with Parse: Runs Parse internally and returns only schema-defined fields.

Endpoint: /split (Split)
Use when: One file contains multiple logical documents or sections.
Output: Page ranges for each section, with confidence scores.
How it works with Parse: Finds section boundaries so each part can be parsed separately.

Endpoint: /classify (Classify)
Use when: Files need to be routed by type before processing.
Output: Best-matching category with per-criterion confidence.
How it works with Parse: A fast, lightweight step that routes files to the right pipeline before parsing.

Endpoint: /edit (Edit)
Use when: A PDF form needs filling or a DOCX needs updating.
Output: A downloadable edited file, plus a reusable form schema.
How it works with Parse: Writes data back into a document after Parse reads it.
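
The endpoint roles above can be read as a small decision procedure. The sketch below is only an illustration of that reading: the `Task` class, its flags, and `plan_endpoints` are hypothetical names invented here, not part of the product's API. Only the endpoint paths come from the page.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of what a caller needs from one file."""
    needs_routing: bool = False           # file type is unknown, pick a pipeline first
    has_multiple_documents: bool = False  # one file bundles several logical documents
    has_schema: bool = False              # the fields to pull are already defined
    writes_back: bool = False             # fill a PDF form or update a DOCX


def plan_endpoints(task: Task) -> list[str]:
    """Return the endpoint paths to call, in order, for a given task."""
    steps = []
    if task.needs_routing:
        steps.append("/classify")  # lightweight routing step before parsing
    if task.has_multiple_documents:
        steps.append("/split")     # find section boundaries first
    if task.writes_back:
        # The page says Edit writes data back after Parse reads the document.
        # Whether the caller must call /parse first is not stated, so this
        # sketch lists only /edit.
        steps.append("/edit")
    elif task.has_schema:
        steps.append("/extract")   # runs Parse internally, per the page
    else:
        steps.append("/parse")     # general structured output for LLM or RAG use
    return steps


if __name__ == "__main__":
    print(plan_endpoints(Task(needs_routing=True, has_schema=True)))
```

The ordering (classify, then split, then the content step) follows the descriptions above: Classify routes files before processing, and Split exists so each part can be handled separately.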

Try out Parse in Studio or via the API.

RAG over enterprise documents

Chunks split at section, table, and figure boundaries, so retrieval returns complete units of meaning instead of cut-off fragments.

Document AI agents

Give an agent a structured view of any uploaded file with bounding boxes and confidence scores.

Tables, spreadsheets, and forms

Reconstructs merged cells, nested headers, and multi-page tables. Output in HTML, Markdown, JSON, or CSV.

Scans, faxes, and photographs

Agentic OCR mode reviews and corrects faded scans, unusual fonts, and photographed pages that break traditional OCR.

Charts and figure extraction

Vision-model summaries describe figures in natural language, with optional structured data extraction for analytics.

Knowledge bases & search

Every element returns with its position on the page, so search products can link results back to the exact paragraph, row, or figure in the source document.
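
As a rough illustration of the retrieval and search use cases above, the sketch below flattens a parse result into index records that keep each block's position, so a search hit can point back to its place in the source. The `chunks` and `blocks` nesting and the `bbox` key follow the page; every other field name (`content`, `type`, and the keys inside `bbox`) and the sample values are assumptions made for this example, not the documented response format.

```python
def to_search_records(parse_result: dict, source_name: str) -> list[dict]:
    """Turn chunks[].blocks[] into flat records carrying location metadata."""
    records = []
    for chunk_index, chunk in enumerate(parse_result.get("chunks", [])):
        for block in chunk.get("blocks", []):
            records.append({
                "text": block.get("content", ""),  # assumed field name
                "type": block.get("type"),         # assumed field name
                "source": source_name,
                "chunk": chunk_index,
                "bbox": block.get("bbox"),         # position on the page
            })
    return records


# Invented sample data, shaped only to exercise the function.
sample_result = {
    "chunks": [
        {
            "blocks": [
                {"type": "Title", "content": "Q3 Report",
                 "bbox": {"page": 1, "left": 0.1, "top": 0.05, "width": 0.8, "height": 0.04}},
                {"type": "Table", "content": "Revenue | 2024 | 2025",
                 "bbox": {"page": 1, "left": 0.1, "top": 0.2, "width": 0.8, "height": 0.3}},
            ]
        }
    ]
}

if __name__ == "__main__":
    for record in to_search_records(sample_result, "report.pdf"):
        print(record["source"], record["type"], record["bbox"]["page"])
```

Keeping `bbox` alongside the text is what lets a search product highlight the exact paragraph, row, or figure rather than just naming the file.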

01
Preserves the original layout
Handles multi-column layouts, headers, footnotes, sidebars, and multi-page tables while keeping reading order intact.
02
Citation-grounded output
Every block includes a bounding box and confidence score. Trace any output back to its exact location.
03
Agentic OCR for hard scans
A VLM review pass corrects handwriting, faded scans, unusual fonts, and misaligned columns.
04
Table fidelity that holds up
Merged cells, nested headers, and multi-page tables are reconstructed in HTML, Markdown, JSON, or CSV.
05
Sync and async, your call
Use sync for low-latency calls and async with webhooks for batch jobs. Upload files up to 5 GB via presigned URL. Reference an earlier result with jobid:// to skip re-processing.
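
To make the sync, async, and jobid:// options above more concrete, here is a minimal sketch of how a client might assemble request payloads. Only the `jobid://` prefix comes from the page. The payload keys (`document_url`, `async`, `webhook_url`) and both helper functions are hypothetical and would need to be checked against the real API reference.

```python
from typing import Optional


def job_reference(job_id: str) -> str:
    """Build a jobid:// reference to a previously processed document."""
    return f"jobid://{job_id}"


def build_parse_payload(document_url: str,
                        webhook_url: Optional[str] = None) -> dict:
    """Assemble a request body (hypothetical field names).

    Without a webhook the call is treated as synchronous; with one, the
    sketch marks the job as asynchronous and records where to notify.
    """
    payload = {"document_url": document_url}
    if webhook_url is not None:
        payload["async"] = True             # assumed flag name
        payload["webhook_url"] = webhook_url  # assumed field name
    return payload


if __name__ == "__main__":
    # Low-latency call on a single file.
    print(build_parse_payload("https://example.com/invoice.pdf"))
    # Batch-style call that reuses an earlier job instead of re-uploading.
    print(build_parse_payload(job_reference("abc123"),
                              webhook_url="https://example.com/hooks/parse"))
```

Passing a `jobid://` reference in place of a file URL is one plausible way the reuse feature could be exposed. The page does not say which field accepts it.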

STEP 01
Send a file
Upload via /upload or pass a public or presigned URL directly. Supports PDFs, images, Office documents, and spreadsheets.
POST /parse
STEP 02
We read the page
Vision models recognize titles, paragraphs, tables, figures, headers, and footers.
vision + agentic OCR
STEP 03
We reconstruct structure
Tables, merged cells, and figures are rebuilt faithfully. Agentic review handles complex pages.
tables · figures · text
STEP 04
You get JSON back
Chunks with typed blocks and bounding boxes, optimized for RAG and LLM workflows.
chunks[].blocks[].bbox
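
The four steps above can be sketched end to end, with the caveat that much of the detail is assumed. `POST /parse` and the `chunks[].blocks[].bbox` path come from the page. The base URL is a placeholder, and the bearer-token header, the `document_url` body field, the `type` and `content` keys, and the numeric 0 to 1 `confidence` value are all assumptions for illustration. The request is built but never sent, so the example runs offline.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host


def make_parse_request(document_url: str, api_key: str) -> urllib.request.Request:
    """Step 01: prepare a POST /parse request for a public or presigned URL."""
    body = json.dumps({"document_url": document_url}).encode("utf-8")  # assumed field
    return urllib.request.Request(
        BASE_URL + "/parse",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def low_confidence_blocks(result: dict, threshold: float = 0.8) -> list[dict]:
    """Step 04: walk chunks[].blocks[] and collect blocks worth a human review."""
    flagged = []
    for chunk in result.get("chunks", []):
        for block in chunk.get("blocks", []):
            # Assumes confidence is a number in [0, 1]; missing means trusted.
            if block.get("confidence", 1.0) < threshold:
                flagged.append(block)
    return flagged


# Invented response, shaped only to exercise the traversal.
sample_response = {
    "chunks": [
        {"blocks": [
            {"type": "Text", "content": "Clear paragraph", "confidence": 0.97,
             "bbox": {"page": 1, "left": 0.1, "top": 0.1, "width": 0.8, "height": 0.1}},
            {"type": "Table", "content": "Faded table", "confidence": 0.55,
             "bbox": {"page": 2, "left": 0.1, "top": 0.3, "width": 0.8, "height": 0.4}},
        ]}
    ]
}

if __name__ == "__main__":
    request = make_parse_request("https://example.com/report.pdf", api_key="YOUR_KEY")
    print(request.get_method(), request.full_url)
    for block in low_confidence_blocks(sample_response):
        print("review:", block["type"], "on page", block["bbox"]["page"])
```

Sending the request would be a call to `urllib.request.urlopen(request)`. It is left out here because the host and authentication details are placeholders.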