Markdownify MCP Server

Pricing

from $1.00 / 1,000 results

Convert any webpage to clean, formatted Markdown that is ready for AI consumption. Ideal for building knowledge bases, documentation scrapers, and content migration tools.

Rating

5.0

(3)

Developer

Crawler Bros

Maintained by Community

Last modified: 8 days ago

Features

✅ Convert any webpage to Markdown - Clean, formatted output
✅ CSS Selector Support - Include/exclude specific sections
✅ JavaScript Rendering - Optional Playwright support for dynamic content
✅ Authentication Support - HTTP Basic Auth for restricted content
✅ Customizable Output - Configure heading styles, strip tags, etc.
✅ Error Handling - Graceful failures with detailed error messages
✅ MCP Server Ready - Structured output for AI consumption

How It Works

1. Input - Provide URL(s) and optional configuration
2. Fetch - Download webpage content (HTTP or Playwright)
3. Extract - Apply include/exclude selectors
4. Convert - Transform HTML to clean Markdown
5. Output - Save to Apify dataset with metadata
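
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the Actor's actual implementation. The function names (`strip_tags`, `to_markdown`), the regex-based tag removal, and the tiny converter that handles only headings and paragraphs are all assumptions made for the example, and the Fetch step is replaced by an inline HTML string.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in for the Fetch step: a real run would download this.
SAMPLE_HTML = """
<html><body>
<nav>Home | About</nav>
<h1>Example Domain</h1>
<script>track();</script>
<p>This domain is for use in examples.</p>
</body></html>
"""


def strip_tags(html, tags):
    """Extract step (simplified): drop whole elements such as <nav> or <script>."""
    for tag in tags:
        html = re.sub(rf"<{tag}\b.*?</{tag}>", "", html, flags=re.S | re.I)
    return html


class _MarkdownBuilder(HTMLParser):
    """Convert step (simplified): only h1-h6 and p elements are handled."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._prefix = None  # None means "not inside a supported element"

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self._prefix = "#" * int(tag[1]) + " "  # ATX-style heading
        elif tag == "p":
            self._prefix = ""

    def handle_endtag(self, tag):
        self._prefix = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._prefix is not None:
            self.blocks.append(self._prefix + text)


def to_markdown(html):
    builder = _MarkdownBuilder()
    builder.feed(html)
    return "\n\n".join(builder.blocks)


# Output step (simplified): build one record instead of writing to a dataset.
cleaned = strip_tags(SAMPLE_HTML, ["nav", "script"])
record = {"url": "https://example.com", "markdown": to_markdown(cleaned)}
print(record["markdown"])
```

A production converter would need to cover links, lists, tables, and nested markup, which is why the sketch keeps each stage as a separate function that can be swapped out.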

Input Parameters

Required

urls (array of strings) - List of webpage URLs to convert

Optional

includeSelectors (array of strings) - CSS selectors to include specific sections
  Example: ["article", ".main-content", "#documentation"]
excludeSelectors (array of strings) - CSS selectors to exclude
  Example: ["nav", "footer", ".advertisement", "script", "style"]
useJavaScript (boolean) - Enable Playwright for JavaScript-heavy pages
  Default: false
headingStyle (string) - Markdown heading style
  Options: "ATX" (# Heading) or "SETEXT" (Heading\n=======)
  Default: "ATX"
stripTags (array of strings) - HTML tags to completely remove
  Default: ["script", "style", "iframe", "noscript"]
auth (object) - HTTP Basic Authentication credentials
  Example: {"username": "user", "password": "pass"}
timeout (integer) - Request timeout in seconds
  Default: 30, Range: 10-120
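
When building input programmatically, it can help to apply the documented defaults and check the documented constraints before starting a run. The helper below is a hypothetical client-side check, not part of the Actor. The name `normalize_input` and the rule that `urls` must be non-empty are assumptions, and the Actor's own validation (per the project structure, `input_schema.json`) remains authoritative.

```python
# Defaults as listed in the parameter reference above.
DEFAULTS = {
    "useJavaScript": False,
    "headingStyle": "ATX",
    "stripTags": ["script", "style", "iframe", "noscript"],
    "timeout": 30,
}


def normalize_input(raw):
    """Fill in documented defaults and reject values outside documented ranges."""
    if not raw.get("urls"):
        raise ValueError("'urls' is required and must be a non-empty list")
    merged = {**DEFAULTS, **raw}
    if merged["headingStyle"] not in ("ATX", "SETEXT"):
        raise ValueError("headingStyle must be 'ATX' or 'SETEXT'")
    if not 10 <= merged["timeout"] <= 120:
        raise ValueError("timeout must be between 10 and 120 seconds")
    return merged


config = normalize_input({"urls": ["https://example.com"], "timeout": 60})
print(config["headingStyle"], config["timeout"])
```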

Input Example

{
  "urls": ["https://apify.com/docs", "https://en.wikipedia.org/wiki/Markdown"],
  "excludeSelectors": ["nav", "footer", ".advertisement"],
  "useJavaScript": false,
  "headingStyle": "ATX",
  "timeout": 30
}

Output Format

Each converted page is saved as a separate record in the dataset:

{
  "url": "https://example.com",
  "title": "Example Domain",
  "markdown": "# Example Domain\n\nThis domain is for use...",
  "markdown_length": 1234,
  "success": true,
  "error": null,
  "scraped_at": "2025-10-24T10:30:00.000Z",
  "meta": {
    "method": "http",
    "heading_style": "ATX",
    "stripped_tags": ["script", "style"],
    "used_include_selectors": false,
    "used_exclude_selectors": true
  }
}
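
Because every record carries `success` and `error` fields, a consumer can separate converted pages from failures before using the Markdown. The following is a small sketch of that idea. The helper name `split_results` is hypothetical, and the shape of a failed record (empty `markdown`, a short error string) is assumed, since only a successful record is shown above.

```python
def split_results(items):
    """Separate successfully converted pages from failures."""
    converted = [item for item in items if item.get("success")]
    failures = [
        (item.get("url"), item.get("error"))
        for item in items
        if not item.get("success")
    ]
    return converted, failures


# Sample records in the documented shape; the error text is made up.
items = [
    {"url": "https://example.com", "markdown": "# Example Domain",
     "success": True, "error": None},
    {"url": "https://example.org/missing", "markdown": "",
     "success": False, "error": "HTTP 404"},
]

converted, failures = split_results(items)
print(f"{len(converted)} converted, {len(failures)} failed")
for url, error in failures:
    print(f"  {url}: {error}")
```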

Use Cases

📚 Build AI-Ready Knowledge Bases

Convert documentation, wikis, and help centers into Markdown for AI training or RAG systems.

📝 Content Migration

Migrate existing web content to Markdown for static site generators (Jekyll, Hugo, etc.).
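
As a rough illustration of the migration step, one dataset record could be turned into a page with YAML front matter, which both Jekyll and Hugo read. This is only a sketch: the function name `to_front_matter_page` and the `source_url` key are assumptions, and the front matter fields a given site needs will vary.

```python
import json


def to_front_matter_page(item):
    """Render one dataset record as a Markdown page with YAML front matter."""
    # json.dumps produces double-quoted strings, which YAML also accepts.
    header = "\n".join([
        "---",
        f"title: {json.dumps(item['title'])}",
        f"source_url: {json.dumps(item['url'])}",  # custom key, an assumption
        f"date: {json.dumps(item['scraped_at'])}",
        "---",
    ])
    return f"{header}\n\n{item['markdown']}\n"


page = to_front_matter_page({
    "url": "https://example.com",
    "title": "Example Domain",
    "markdown": "# Example Domain\n\nThis domain is for use...",
    "scraped_at": "2025-10-24T10:30:00.000Z",
})
print(page)
```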

🤖 AI Agent Integration

Enable AI agents to consume web content in a clean, structured format.

📄 Documentation Scraping

Extract and format technical documentation from multiple sources.

🔄 Content Synchronization

Keep Markdown versions of web pages up-to-date automatically.

API Integration

JavaScript/Node.js

const { ApifyClient } = require("apify-client");

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const input = {
  urls: ["https://example.com"],
  excludeSelectors: ["nav", "footer"],
};

// CommonJS has no top-level await, so run inside an async function.
(async () => {
  // Start the Actor and wait for it to finish.
  const run = await client.actor("YOUR_ACTOR_ID").call(input);
  // Read the converted pages from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  items.forEach((item) => {
    console.log(`Title: ${item.title}`);
    console.log(`Markdown length: ${item.markdown_length}`);
    console.log(item.markdown);
  });
})();

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

input_data = {
    'urls': ['https://example.com'],
    'excludeSelectors': ['nav', 'footer'],
}

# Start the Actor and wait for it to finish
run = client.actor('YOUR_ACTOR_ID').call(run_input=input_data)

# Iterate over the converted pages in the run's default dataset
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(f"Title: {item['title']}")
    print(f"Markdown length: {item['markdown_length']}")
    print(item['markdown'])

cURL

curl -X POST https://api.apify.com/v2/acts/YOUR_ACTOR_ID/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com"],
    "excludeSelectors": ["nav", "footer"]
  }'

Tips & Best Practices

🚀 Performance

Use useJavaScript: false for static pages (much faster)
Only enable useJavaScript: true for dynamic content
Use includeSelectors to extract only what you need
Batch multiple URLs in a single run

🎯 Accuracy

Test selectors in browser DevTools first
Use specific includeSelectors for precise extraction
Combine include and exclude for best results
Add common noise elements to excludeSelectors

🔧 Troubleshooting

Empty markdown? Check if selectors are correct
Missing content? Try enabling useJavaScript
Timeout errors? Increase timeout value
Authentication issues? Verify auth credentials
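
Two of the checks above (missing content and timeouts) can be combined into an automated second pass. The sketch below builds a follow-up input for URLs that failed or came back with empty Markdown, turning on `useJavaScript` and doubling the timeout up to the documented maximum of 120 seconds. The helper name `build_retry_input` and the doubling policy are assumptions, not behavior of the Actor.

```python
def build_retry_input(items, previous_input):
    """Return input for a second run covering failed or empty pages, or None."""
    retry_urls = [
        item["url"]
        for item in items
        if not item.get("success") or not item.get("markdown")
    ]
    if not retry_urls:
        return None
    return {
        **previous_input,
        "urls": retry_urls,
        "useJavaScript": True,  # render dynamic content on the second pass
        "timeout": min(previous_input.get("timeout", 30) * 2, 120),
    }


# Sample records; real ones come from the run's dataset.
items = [
    {"url": "https://example.com/a", "success": True, "markdown": "# A"},
    {"url": "https://example.com/b", "success": True, "markdown": ""},
    {"url": "https://example.com/c", "success": False, "markdown": None},
]
first_input = {"urls": [item["url"] for item in items], "timeout": 30}

retry = build_retry_input(items, first_input)
print(retry)
```

Retrying only the affected URLs keeps the slower Playwright path off pages that already converted over plain HTTP.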

Development

Local Testing

# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Run locally
python -m src

Project Structure

markdownify-mcp/
├── .actor/
│ ├── actor.json # Actor configuration
│ ├── input_schema.json # Input validation
│ └── output_schema.json # Output structure
├── src/
│ ├── __main__.py # Main entry point
│ ├── fetcher.py # HTTP & Playwright fetchers
│ ├── extractor.py # Content extraction
│ └── converter.py # HTML to Markdown
├── Dockerfile # Docker configuration
├── requirements.txt # Python dependencies
└── README.md # This file

License

Apache 2.0

Support

For issues, questions, or feature requests, please contact support or open an issue in the repository.

Made with ❤️ for the AI community

URL: https://apify.com/crawlerbros/markdownify-mcp-server