
URL: https://glama.ai/mcp/servers/search/real-time-web-scraping-tools

Real-time web scraping tools | Glama


Search for: Real-time web scraping tools


  • Why this server?

    This server is designed explicitly to 'scrape and extract data from any website globally,' bypassing anti-bot systems and handling JavaScript, making it an excellent 'real-time crawler tool'.

    license: -, quality: -, maintenance: -
    Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.
  • Why this server?

    Directly enables 'web scraping and crawling capabilities for LLM clients,' covering both single-page scraping and multi-page website crawling, which fits the definition of a comprehensive crawler tool.

    license: A, quality: -, maintenance: C
    Enables web scraping and crawling capabilities for LLM clients: single-page scraping, multi-page website crawling, and web search. Supports multiple scraping engines (Playwright, Cheerio, Puppeteer) and flexible output formats including Markdown, HTML, text, and screenshots.
    11
    6
    MIT
  • Why this server?

    This server is a 'powerful web scraping MCP server' built on the well-known Scrapy framework, indicating professional-grade crawling capabilities.

    license: A, quality: A, maintenance: F
    A powerful web scraping MCP server built on Scrapy and FastMCP that supports multiple scraping methods (HTTP, Scrapy, browser automation), anti-detection techniques, form handling, and concurrent crawling. Designed for commercial environments with enterprise-grade features like intelligent retry mechanisms, performance monitoring, and configurable data extraction.
    10
    3
    MIT
  • Why this server?

    Provides AI-powered web scraping, crawling, and structured data extraction, covering the core functions of a real-time crawler.

  • Why this server?

    Explicitly defined as a 'web scraping server' supporting multiple formats and content extraction rules, which directly addresses the user's need for a crawling tool.

    license: A, quality: A, maintenance: C
    A TypeScript-based web scraping server built on the Model Context Protocol that offers multiple export formats, content extraction rules, and support for both static and dynamic (SPA) websites.
    7
    12
    1
    MIT
  • Why this server?

    Offers tools to 'scrape and extract data' including web search, web page extraction, and screenshot capture, essential functions for a real-time web crawler tool.

    license: A, quality: A, maintenance: C
    Enables web content extraction, screenshot capture, web search, arXiv paper search, and image search through Jina AI's APIs. Provides tools for reading URLs as markdown, searching the web for current information, and finding academic papers or images.
    19
    717
    Apache 2.0
  • Why this server?

    Enables browser automation and web page interactions, a core technology used for real-time, dynamic web scraping and data extraction.

    license: A, quality: -, maintenance: D
    Enables LLMs to perform browser automation and web page interactions using Playwright's accessibility tree instead of screenshots. Provides fast, deterministic web automation through structured data without requiring vision models.
    5,659,017
    Apache 2.0
  • Why this server?

    This server focuses on scraping and crawling websites affected by bot detection, making it suitable for acquiring real-time data from challenging sources.

    license: A, quality: A, maintenance: A
    A server that enables web scraping of difficult-to-access websites affected by bot detection, captchas, or geolocation restrictions, returning results in either HTML or Markdown format.
    4
    2
    75
    18
    MIT
  • Why this server?

    Explicitly covers 'web searching and webpage scraping' using crawler technology, making it a fitting tool for gathering web data.

    license: A, quality: B, maintenance: D
    Enables web searching and webpage scraping using pure crawler technology without requiring official APIs. Supports Bing web and news search, batch webpage scraping, and content extraction through Puppeteer automation.
    4
    1
    MIT