URL: https://apify.com/benthepythondev/arxiv-scraper

arXiv Scraper - Scientific Papers, Abstracts & PDFs · Apify


Pricing

Pay per usage

arXiv Scraper for the official arXiv API. Search 2M+ scientific papers in CS, physics, math and biology by keyword, title, author, abstract or category. Extract title, authors, abstract, categories, DOI, dates and PDF links. For AI/ML research, literature reviews and RAG datasets.

Rating

0.0

(0)

Developer

ben

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 3
Monthly active users: 1
Last modified: 8 hours ago

arXiv Scraper - Scientific Papers, Abstracts & PDFs

Search arXiv.org (2M+ open-access scientific papers in physics, CS, math, biology, economics and more) via the official arXiv API.

Built for AI/ML research, literature reviews, RAG datasets, and research analytics. Keyless, fast and reliable: no proxy or browser needed.

What you get

Per paper:

  • title, arxiv_id
  • authors, author_count
  • abstract (full text)
  • primary_category, categories
  • published, updated
  • doi, journal_ref, comment
  • pdf_url, abstract_url
  • scraped_at
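
When consuming the dataset, the fields above can be modelled as a typed record. The sketch below is illustrative only: the source lists field names but not types, so the list types follow the sample output further down, and treating `updated`, `doi`, `journal_ref`, `comment` and `scraped_at` as optional is an assumption. `Paper` and `from_item` are hypothetical names, not part of the Actor.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class Paper:
    """One dataset item. Field names follow the list above; types are assumed."""

    arxiv_id: str
    title: str
    authors: list[str] = field(default_factory=list)
    abstract: str = ""
    primary_category: str = ""
    categories: list[str] = field(default_factory=list)
    published: str = ""
    updated: Optional[str] = None
    doi: Optional[str] = None
    journal_ref: Optional[str] = None
    comment: Optional[str] = None
    pdf_url: str = ""
    abstract_url: str = ""
    scraped_at: Optional[str] = None

    @property
    def author_count(self) -> int:
        # Derived here rather than stored, so it cannot drift from `authors`.
        return len(self.authors)

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "Paper":
        # Ignore keys this sketch does not model (including `author_count`).
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in item.items() if k in known})
```

`Paper.from_item(item)` raises `TypeError` if `arxiv_id` or `title` is missing, which is a deliberate choice here to surface malformed items early.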

Why this Actor

Feature                    arXiv Scraper  Manual search  Raw arXiv API
Clean flat JSON output     Yes            -              Atom XML to parse
Search + filters + paging  Yes            Slow           DIY
PDF + abstract links       Yes            Manual         Yes
Pay per result             Yes            -              -
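
To make the "Atom XML to parse" entry concrete, here is a minimal sketch of flattening one raw arXiv API response into records shaped like this Actor's output. It is not the Actor's code: `parse_feed` and `flatten_entry` are hypothetical names, the output keys are borrowed from the field list above, and the element names (`arxiv:primary_category`, `arxiv:doi`, the `title="pdf"` link) are those the arXiv Atom feed is assumed to use.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by the arXiv Atom feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}


def _text(entry, path):
    """Return the element's text with whitespace collapsed, or None."""
    node = entry.find(path, NS)
    if node is None or node.text is None:
        return None
    return " ".join(node.text.split())


def flatten_entry(entry):
    """Turn one <entry> element into a flat dict."""
    abstract_url = _text(entry, "atom:id")
    authors = [
        " ".join(name.text.split())
        for name in entry.findall("atom:author/atom:name", NS)
        if name.text
    ]
    primary = entry.find("arxiv:primary_category", NS)
    pdf_links = [
        link.get("href")
        for link in entry.findall("atom:link", NS)
        if link.get("title") == "pdf"
    ]
    return {
        # The <id> element is the abstract URL; the ID is the part after /abs/.
        "arxiv_id": abstract_url.rsplit("/abs/", 1)[-1] if abstract_url else None,
        "title": _text(entry, "atom:title"),
        "authors": authors,
        "author_count": len(authors),
        "abstract": _text(entry, "atom:summary"),
        "primary_category": primary.get("term") if primary is not None else None,
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "published": _text(entry, "atom:published"),
        "updated": _text(entry, "atom:updated"),
        "doi": _text(entry, "arxiv:doi"),
        "journal_ref": _text(entry, "arxiv:journal_ref"),
        "comment": _text(entry, "arxiv:comment"),
        "pdf_url": pdf_links[0] if pdf_links else None,
        "abstract_url": abstract_url,
    }


def parse_feed(xml_text):
    """Parse a full Atom response body into a list of flat dicts."""
    root = ET.fromstring(xml_text)
    return [flatten_entry(entry) for entry in root.findall("atom:entry", NS)]
```

Calling `parse_feed(body)` on the text of an API response returns one dict per paper; fields that are absent from an entry come back as `None`.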

Input

Use the simple fields, or a raw searchQuery for full arXiv syntax.

Field        Type     Description
allFields    string   Keyword across title/abstract/authors
title        string   Title contains
author       string   Author name
abstract     string   Abstract contains
category     string   arXiv category (e.g. cs.LG, cs.CL, cs.AI)
searchQuery  string   Advanced raw query (overrides the above)
sortBy       string   Relevance / Newest / Recently updated
maxResults   integer  Max papers to return

Example: newest LLM papers

{
  "allFields": "large language models",
  "sortBy": "newest",
  "maxResults": 100
}
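
An input like the one above could be submitted from Python with the `apify-client` package. This is a sketch under assumptions: the Actor ID is taken from the store URL, the token is supplied by the caller, and `build_run_input` and `run_actor` are hypothetical helper names. The client import is kept inside the function so the helper module loads even where the package is not installed.

```python
ACTOR_ID = "benthepythondev/arxiv-scraper"  # taken from the store URL


def build_run_input(keyword, max_results=100):
    """Build the 'newest papers for a keyword' input shown above."""
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {
        "allFields": keyword,
        "sortBy": "newest",
        "maxResults": max_results,
    }


def run_actor(run_input, token):
    """Run the Actor, wait for it to finish, and return the dataset items."""
    # Lazy import: requires `pip install apify-client`.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (needs a valid Apify API token and network access):
#   papers = run_actor(build_run_input("large language models"), token="...")
#   for paper in papers[:5]:
#       print(paper["arxiv_id"], paper["title"])
```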

Example: a category, advanced syntax

{
  "searchQuery": "cat:cs.CL AND abs:retrieval augmented",
  "sortBy": "newest",
  "maxResults": 200
}
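
The page does not say how the simple fields are turned into arXiv query syntax. One plausible mapping, sketched below, uses the arXiv API field prefixes (`all:`, `ti:`, `au:`, `abs:`, `cat:`) and joins the filled-in fields with `AND`. The function name `build_search_query`, the `AND` combination and the quoting of multi-word values as phrases are all assumptions; only the rule that `searchQuery` overrides the other fields comes from the table above.

```python
# Assumed mapping from this Actor's simple input fields to arXiv field prefixes.
PREFIXES = {
    "allFields": "all",
    "title": "ti",
    "author": "au",
    "abstract": "abs",
    "category": "cat",
}


def build_search_query(actor_input):
    """Return an arXiv search_query string for the given Actor input."""
    raw = actor_input.get("searchQuery")
    if raw:
        # Documented behaviour: the raw query overrides the simple fields.
        return raw

    terms = []
    for field_name, prefix in PREFIXES.items():
        value = (actor_input.get(field_name) or "").strip()
        if not value:
            continue
        if " " in value:
            # Assumption: treat multi-word values as an exact phrase.
            value = f'"{value}"'
        terms.append(f"{prefix}:{value}")
    return " AND ".join(terms)
```

For example, `build_search_query({"category": "cs.CL", "abstract": "retrieval"})` yields `abs:retrieval AND cat:cs.CL` under this mapping.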

Sample output

{
  "arxiv_id": "2605.30351v1",
  "title": "VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Video",
  "authors": ["Hidir Yesiltepe", "Jiazhen Hu"],
  "primary_category": "cs.CV",
  "categories": ["cs.CV", "cs.AI"],
  "published": "2026-05-28T17:59:57Z",
  "abstract": "Long-rollout causal video diffusion...",
  "pdf_url": "https://arxiv.org/pdf/2605.30351v1",
  "abstract_url": "https://arxiv.org/abs/2605.30351v1"
}

Use cases

  • AI/ML research: track the latest papers in a field or category
  • RAG / LLM datasets: build corpora of abstracts + PDF links by topic
  • Literature reviews: gather and rank relevant papers fast
  • Research analytics: analyse output by category, author and time
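
Two of these use cases can be sketched directly on the output records. The helpers below are illustrative, with hypothetical names (`papers_per_category_year`, `to_rag_document`); they assume each record carries the keys shown in the sample output and that `published` is an ISO 8601 timestamp.

```python
from collections import Counter


def papers_per_category_year(papers):
    """Research analytics: count papers by (primary category, publication year)."""
    counts = Counter()
    for paper in papers:
        year = paper["published"][:4]  # e.g. "2026" from "2026-05-28T17:59:57Z"
        counts[(paper["primary_category"], year)] += 1
    return counts


def to_rag_document(paper):
    """RAG datasets: combine title and abstract into one text chunk with metadata."""
    return {
        "id": paper["arxiv_id"],
        "text": f'{paper["title"]}\n\n{paper["abstract"]}',
        "metadata": {
            "authors": paper["authors"],
            "categories": paper["categories"],
            "pdf_url": paper["pdf_url"],
        },
    }
```

Keeping the PDF link in the metadata rather than in the text lets a retrieval pipeline cite the source without embedding the URL itself.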

Pricing

Pay-per-result. You are charged only for the papers returned; empty runs cost nothing.

Notes & legal

  • Uses the official arXiv API. Please respect arXiv's API terms and rate limits (the Actor waits between requests).
  • Use data only for lawful purposes.
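
For readers who call the arXiv API directly, waiting between requests might look like the sketch below. It is not the Actor's implementation: the page size, the 3-second delay (the pause arXiv's API documentation suggests between calls) and the names `page_url`, `page_starts` and `fetch_pages` are assumptions chosen for illustration.

```python
import time
import urllib.parse
import urllib.request

API_URL = "http://export.arxiv.org/api/query"


def page_url(search_query, start, page_size,
             sort_by="submittedDate", sort_order="descending"):
    """Build the URL for one page of results."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": page_size,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    return f"{API_URL}?{urllib.parse.urlencode(params)}"


def page_starts(max_results, page_size):
    """Offsets of the pages needed to cover max_results items."""
    return list(range(0, max_results, page_size))


def fetch_pages(search_query, max_results, page_size=100, delay_seconds=3.0):
    """Yield raw Atom XML pages, sleeping between consecutive requests."""
    for index, start in enumerate(page_starts(max_results, page_size)):
        if index:
            time.sleep(delay_seconds)  # be polite: pause before every page but the first
        size = min(page_size, max_results - start)
        with urllib.request.urlopen(page_url(search_query, start, size), timeout=30) as resp:
            yield resp.read().decode("utf-8")
```

`fetch_pages` is a generator, so no request is sent until it is iterated, and a caller can stop early once a page comes back with no entries.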

You might also like

📄 arXiv Papers Monitor

skootle/arxiv-papers

Pull new AI / ML / CS / physics / math papers from arXiv as they land via the official arXiv API. Title, abstract, authors, PDF link, DOI, and LLM-ready summary card per paper. For ML researchers, AI agents, and journalists. Export, run via API, schedule, or integrate with other tools.

ArXiv Paper Scraper

sheshinmcfly/arxiv-paper-scraper

Search and extract scientific papers from ArXiv.org across any field. Returns title, authors, full abstract, PDF link, arXiv ID, categories, and submission date. Ideal for AI research monitoring, RAG pipelines, literature reviews, and academic trend analysis. No API key needed.

ArXiv Paper Search

gentle_cloud/arxiv-paper-search

Search and extract academic papers from ArXiv. Find papers by keyword, author, or category with full metadata including title, authors, abstract, categories, and PDF links.

arXiv Scraper

artificially/arxiv-scraper

Search and extract academic papers from arXiv.org. Get paper titles, authors, abstracts, categories, and PDF links for AI/ML, physics, math, and more.

arXiv Research Paper Scraper

crawlerbros/arxiv-research-paper-scraper

Scrape research papers from arXiv.org - search by query, category, or author; lookup by arXiv ID. Returns title, authors, abstract, PDF URL, DOI, categories, and more. Uses the public arXiv Atom API. No login or proxy required.

📄 ArXiv Scraper - Preprints & Research Data

nexgendata/arxiv-scraper

Extract papers from ArXiv β€” titles, abstracts, authors, categories & PDF links. Monitor new AI, physics, math & CS research. Build tracking & literature review tools. Pay per paper.

arXiv Metadata Collector - Metadata, PDF, Authors & Abstract

scrapepilot/arxiv-metadata-collector---metadata-pdf-authors-abstract

Scrape arXiv research papers with metadata including title, authors, abstract, PDF links, DOI, and categories. Supports keyword search, proxy integration, and structured dataset output for AI, ML, and academic research use.

arXiv Papers Scraper

crawlerbros/arxiv-papers-scraper

Scrape academic preprints from arXiv.org by keyword, author, or category. Returns clean records with title, authors, abstract, categories, PDF URL, DOI. HTTP-only via the public arXiv API. No login, no proxy.