
URL: https://glama.ai/mcp/servers/search/mcp-server-for-finding-research-data-and-models-for-aiml-training

MCP server for finding research data and models for AI/ML training | Glama



  • Why this server?

    This server is highly effective for gathering raw data, as it can scrape and extract structured data from any website globally, bypassing anti-bot systems. This directly fulfills the need to pull information from 'websites, articles' for training data.

    License: - | Quality: - | Maintenance: -
    Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.
  • Why this server?

    Provides a direct interface to the Kaggle API for searching and accessing datasets and kernels, which are key sources of the data and models the request mentions ('kaggle').

    License: A (MIT) | Quality: - | Maintenance: D
    Connects Claude AI to the Kaggle API through the Model Context Protocol, enabling users to browse competitions, search and download datasets, analyze kernels, and access pre-trained models through natural language interactions.
  • Why this server?

    Allows access to the Hugging Face Hub API to retrieve information about machine learning models and datasets. This is essential for finding existing models or data resources for training AI/ML models.

    License: F | Quality: A | Maintenance: D
    Enables access to the Hugging Face Hub API to search and retrieve information about machine learning models, datasets, and their metadata. Provides comprehensive tools for exploring the Hugging Face ecosystem including model details, dataset information, and parquet file access.
  • Why this server?

    Specifically designed to search, filter, and export Software Engineering papers on arXiv, directly addressing the requirement to find information in 'research papers'.

    License: F | Quality: A | Maintenance: D
    An MCP server that enables intelligent searching, filtering, and exporting of Software Engineering papers on arXiv with tools for querying by keywords, authors, analyzing trends, and finding related research.
  • Why this server?

    Enables searching and retrieving detailed information from PubMed articles using the NCBI Entrez API, providing access to biomedical 'research papers' and scientific data for LLMs.

    License: F | Quality: A | Maintenance: D
    Enables searching and retrieving detailed information from PubMed articles using the NCBI Entrez API. Supports configurable search parameters including title/abstract filtering and keyword expansion to find relevant scientific publications.
  • Why this server?

    Enables web scraping and extraction from any website globally, supporting dynamic content and outputting structured data, perfect for gathering broad information from 'websites, articles' and 'anywhere'.

    License: A (MIT) | Quality: - | Maintenance: C
    Enables web scraping and crawling capabilities for LLM clients, supporting single-page scraping, multi-page website crawling, and web search with multiple engines (Playwright, Cheerio, Puppeteer) and flexible output formats including markdown, HTML, text, and screenshots.
  • Why this server?

    Facilitates comprehensive web research by leveraging Tavily's APIs to gather and structure data for high-quality markdown document creation, an excellent tool for compiling research from various 'websites' and 'articles'.

    License: - | Quality: B | Maintenance: -
    A Model Context Protocol compliant server that facilitates comprehensive web research by utilizing Tavily's Search and Crawl APIs to gather and structure data for high-quality markdown document creation.
  • Why this server?

    A multipurpose tool focused on Retrieval-Augmented Generation that searches, indexes, and processes documents (PDF, DOCX, etc.), ideal for handling and making sense of the raw data collected from research papers and articles for LLM consumption.

    License: F | Quality: A | Maintenance: B
    Intelligent knowledge base system that enables users to process documents in 25+ formats and perform semantic search and Q&A through vector retrieval. Supports multiple AI models including OpenAI and DouBao with local processing capabilities.
  • Why this server?

    Offers access to a vast array of public datasets, which directly addresses the need to find 'data' for training AI/ML models from diverse and accessible sources.