URL: https://apify.com/gentle_cloud/pandoc-document-converter

⇱ Pandoc Document Converter Β· Apify


Pricing

$0.01 / actor start

Pandoc Document Converter

Convert documents between formats (HTML, Markdown, DOCX, EPUB, PDF, LaTeX, RST, ODT, PPTX) using Pandoc. Accepts raw text or URL input.

Rating: 0.0 (0)

Developer: Monkey Coder (Maintained by Community)

Actor stats

  • Bookmarked: 1
  • Total users: 16
  • Monthly active users: 6
  • Last modified: 21 days ago

📄 Pandoc Document Converter

Convert documents between multiple formats using the powerful Pandoc document conversion engine. Supports HTML, Markdown, DOCX, EPUB, PDF, LaTeX, RST, ODT, PPTX, and more.

✨ Features

  • 20+ format support β€” Convert between HTML, Markdown, GFM, CommonMark, LaTeX, RST, DOCX, EPUB, ODT, PPTX, PDF, plain text, AsciiDoc, MediaWiki, Org-mode, and more
  • URL input β€” Fetch content directly from a URL and convert it
  • Raw text input β€” Paste HTML, Markdown, or any supported format directly
  • Binary output β€” DOCX, EPUB, ODT, PPTX, and PDF files are saved to the key-value store for easy download
  • PDF generation β€” Powered by WeasyPrint (no heavy LaTeX installation needed)
  • Standalone mode β€” Produce complete documents with proper headers and footers

🔧 How It Works

  1. You provide content (raw text or a URL to fetch from)
  2. You specify the input format and desired output format
  3. The Actor runs the Pandoc CLI to perform the conversion
  4. Text output (HTML, Markdown, etc.) is returned in the dataset
  5. Binary output (DOCX, EPUB, PDF, etc.) is saved to the key-value store and base64-encoded in the dataset
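
The steps above can be sketched as a thin wrapper around the Pandoc CLI. This is a minimal illustration, not the Actor's actual source: the function names `build_pandoc_command` and `convert` are hypothetical, and routing PDF through HTML with `--pdf-engine=weasyprint` is an assumption based on the WeasyPrint note in the feature list. The flags themselves (`--from`, `--to`, `--standalone`, `--output`, `--pdf-engine`) are standard Pandoc options.

```python
import subprocess

# Formats the listing describes as binary output (written to a file).
BINARY_FORMATS = {"docx", "epub", "odt", "pptx", "pdf"}


def build_pandoc_command(from_format, to_format, output_path=None, standalone=False):
    """Assemble a Pandoc argument list for one conversion (hypothetical helper)."""
    cmd = ["pandoc", "--from", from_format]
    if to_format == "pdf":
        # Assumption: PDF is rendered from intermediate HTML by WeasyPrint.
        cmd += ["--to", "html", "--pdf-engine=weasyprint"]
    else:
        cmd += ["--to", to_format]
    if standalone:
        cmd.append("--standalone")
    if to_format in BINARY_FORMATS and output_path is None:
        raise ValueError("binary formats need an output file path")
    if output_path is not None:
        cmd += ["--output", output_path]
    return cmd


def convert(content, from_format, to_format, **options):
    """Run Pandoc with content on stdin; returns stdout (empty when writing a file)."""
    cmd = build_pandoc_command(from_format, to_format, **options)
    result = subprocess.run(cmd, input=content, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `convert("<h1>Hello</h1>", "html", "markdown")` would require a local Pandoc installation; `build_pandoc_command` can be inspected without one.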

🚀 How to Use

  1. Set input: Either paste content in the "Content" field or enter a URL in "Source URL"
  2. Choose formats: Set "Input Format" (e.g., html) and "Output Format" (e.g., markdown)
  3. Run the Actor
  4. Get results: Check the dataset for text output, or download binary files from the key-value store
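
For programmatic runs, the same choices could be collected into a JSON run input. The sketch below is an assumption-heavy illustration: the listing only shows the field labels ("Content", "Source URL", "Input Format", "Output Format"), so the keys `content`, `source_url`, `input_format`, and `output_format` are hypothetical and should be checked against the Actor's input schema. Requiring exactly one of content or URL is also an assumption drawn from the "either ... or" wording.

```python
def build_run_input(input_format, output_format, content=None, source_url=None):
    """Build a run-input dict; all key names here are hypothetical."""
    # Assumption: the Actor expects raw content or a URL, not both.
    if (content is None) == (source_url is None):
        raise ValueError("provide exactly one of content or source_url")
    run_input = {"input_format": input_format, "output_format": output_format}
    if content is not None:
        run_input["content"] = content
    else:
        run_input["source_url"] = source_url
    return run_input


example = build_run_input("html", "markdown", content="<h1>Hello World</h1>")
```

The resulting dict could then be pasted into the Apify console's JSON input editor or passed to an API client once the real key names are confirmed.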

Common Conversions

| From | To | Use Case |
| --- | --- | --- |
| html | markdown | Convert web pages to Markdown |
| markdown | html | Render Markdown as HTML |
| html | docx | Save web content as a Word document |
| markdown | docx | Create Word documents from Markdown |
| html | epub | Convert articles to e-book format |
| markdown | pdf | Generate PDF from Markdown |
| html | plain | Strip HTML tags, extract plain text |
| latex | html | Convert LaTeX papers to web format |
| html | rst | Convert to reStructuredText |

📊 Sample Output (text conversion)

{
  "from_format": "html",
  "to_format": "markdown",
  "input_size_bytes": 245,
  "output_size_bytes": 128,
  "output_type": "text",
  "output": "# Hello World\n\nThis is a **sample HTML** document for conversion.\n\n- Item 1\n- Item 2\n- Item 3\n",
  "converted_at": "2026-03-20T08:30:00.000000"
}

📊 Sample Output (binary conversion)

{
  "from_format": "html",
  "to_format": "docx",
  "input_size_bytes": 245,
  "output_size_bytes": 8432,
  "output_type": "binary",
  "output_base64": "UEsDBBQAAAAI...",
  "download_key": "output.docx",
  "converted_at": "2026-03-20T08:30:00.000000"
}

Binary files (DOCX, EPUB, ODT, PPTX, PDF) are also saved to the key-value store with the key output.<format> for direct download.
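
A consumer of the dataset might handle both item shapes as follows. This is a small sketch that relies only on the field names in the sample outputs above (`output_type`, `output`, `output_base64`, `download_key`); the helper name `extract_output` is hypothetical.

```python
import base64


def extract_output(item):
    """Return (filename, payload): bytes for binary items, str for text items."""
    if item["output_type"] == "binary":
        # Binary results arrive base64-encoded alongside a suggested file name.
        return item["download_key"], base64.b64decode(item["output_base64"])
    # Text results carry the converted document directly.
    return None, item["output"]


text_item = {"output_type": "text", "output": "# Hello World\n"}
binary_item = {
    "output_type": "binary",
    "output_base64": base64.b64encode(b"PK\x03\x04demo").decode("ascii"),
    "download_key": "output.docx",
}
```

For a binary item, the returned bytes can be written to disk under the returned file name, for example with `pathlib.Path(name).write_bytes(payload)`.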

Input Formats

html, markdown, gfm (GitHub Flavored Markdown), commonmark, latex, rst, textile, org, mediawiki, json (Pandoc AST)

📤 Output Formats

html, markdown, gfm, commonmark, latex, rst, plain, docx, epub, odt, pptx, asciidoc, mediawiki, org, pdf
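
The two lists can be turned into a quick pre-flight check before starting a run. The sketch below copies the format names from this page; splitting the outputs into text and binary follows the binary formats named in the feature list, and the function name `classify_conversion` is hypothetical.

```python
INPUT_FORMATS = {
    "html", "markdown", "gfm", "commonmark", "latex",
    "rst", "textile", "org", "mediawiki", "json",
}
TEXT_OUTPUT_FORMATS = {
    "html", "markdown", "gfm", "commonmark", "latex",
    "rst", "plain", "asciidoc", "mediawiki", "org",
}
BINARY_OUTPUT_FORMATS = {"docx", "epub", "odt", "pptx", "pdf"}


def classify_conversion(from_format, to_format):
    """Return "text" or "binary" for a listed format pair, else raise ValueError."""
    if from_format not in INPUT_FORMATS:
        raise ValueError(f"unsupported input format: {from_format}")
    if to_format in BINARY_OUTPUT_FORMATS:
        return "binary"
    if to_format in TEXT_OUTPUT_FORMATS:
        return "text"
    raise ValueError(f"unsupported output format: {to_format}")
```

Note that some formats appear on only one side: `docx`, for instance, is listed as an output but not as an input, so `classify_conversion("docx", "html")` is rejected.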

⚠️ Notes

  • Input size limit: 10 MB maximum
  • PDF output: Uses WeasyPrint engine (supports CSS styling, no LaTeX needed)
  • Binary output: Files are base64-encoded in the dataset AND saved to the key-value store for direct download
  • URL fetching: Basic HTTP GET with browser-like User-Agent. Sites with advanced anti-bot protection may not work.
  • Memory: Recommended 1 GB for large documents or PDF generation
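
The URL-fetching and size-limit notes could look roughly like this in practice. It is a sketch using only the Python standard library, not the Actor's code: the User-Agent string is illustrative, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import urllib.request

# Assumption: "10 MB" is interpreted as 10 MiB.
MAX_INPUT_BYTES = 10 * 1024 * 1024
# Illustrative browser-like User-Agent; the Actor's real value is not documented.
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url):
    """Prepare a plain HTTP GET with a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT}, method="GET")


def check_size(data):
    """Reject input that exceeds the documented size limit."""
    if len(data) > MAX_INPUT_BYTES:
        raise ValueError("input exceeds the 10 MB limit")
    return data


def fetch(url, timeout=30):
    """Fetch a URL, reading one byte past the limit so oversize bodies are detected."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return check_size(response.read(MAX_INPUT_BYTES + 1))
```

A plain GET like this sends no cookies and runs no JavaScript, which is consistent with the note that sites with advanced anti-bot protection may not work.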
