arXiv Paper & Author Scraper

Under maintenance

Pricing

Pay per usage

Extract academic papers, abstracts, and author details from arXiv using the official API. Ideal for research monitoring, literature reviews, and building academic datasets.

Rating: 0.0 (0)

Developer: Automly (Maintained by Community)

Last modified: a month ago

Why use this actor?

Official API reliability — Uses the arXiv export API for stable, structured data without scraping complexity.
Research monitoring — Track new papers in specific fields or by keyword.
Literature reviews — Collect abstracts, authors, and categories for systematic analysis.
Academic lead generation — Build lists of researchers and their affiliations by topic.
RAG & AI pipelines — Feed paper abstracts and metadata into vector databases for semantic search.

Features

Search papers by free-text query or arXiv category codes
Filter by date range (last week, last month, last year, or custom range)
Sort by relevance, submission date, or last updated date
Extract full abstracts and author lists, with affiliations where the submitter provided them
Output authors as separate records for easy analysis
Respects arXiv's polite usage policy with built-in rate limiting

Input

Field	Type	Default	Description
searchQuery	string	—	arXiv search query, e.g. `machine learning` or `cat:cs.AI`
categories	array	—	List of arXiv category codes, e.g. `["cs.AI", "cs.LG"]`
dateRange	string	—	`lastWeek`, `lastMonth`, `lastYear`, or `YYYY-MM-DD TO YYYY-MM-DD`
maxResults	integer	100	Maximum papers to return (1–500)
extractAuthors	boolean	true	Include author records as separate rows
extractAbstract	boolean	true	Include paper abstracts
sortBy	string	relevance	`relevance`, `lastUpdatedDate`, or `submittedDate`
sortOrder	string	descending	`ascending` or `descending`
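
The `dateRange` field accepts either a preset or a custom `YYYY-MM-DD TO YYYY-MM-DD` range. The sketch below shows one way such a value could be resolved into concrete dates. The window lengths (7, 30, and 365 days) and the function name `resolve_date_range` are assumptions made for illustration; the actor's own definitions are not documented here.

```python
from datetime import date, timedelta

# Assumed window lengths; the actor's real definitions are not documented.
PRESET_DAYS = {"lastWeek": 7, "lastMonth": 30, "lastYear": 365}


def resolve_date_range(value: str, today: date) -> tuple[date, date]:
    """Turn a dateRange input value into an inclusive (start, end) pair."""
    if value in PRESET_DAYS:
        return today - timedelta(days=PRESET_DAYS[value]), today

    # Custom form: "YYYY-MM-DD TO YYYY-MM-DD"
    start_text, separator, end_text = value.partition(" TO ")
    if not separator:
        raise ValueError(f"unrecognised dateRange: {value!r}")
    start = date.fromisoformat(start_text.strip())
    end = date.fromisoformat(end_text.strip())
    if start > end:
        raise ValueError("dateRange start must not be after end")
    return start, end
```

Passing `today` explicitly keeps the function deterministic, which makes it easy to check a preset against a known date.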

Example input

{
  "searchQuery": "large language models",
  "categories": ["cs.CL", "cs.AI"],
  "dateRange": "lastMonth",
  "maxResults": 50,
  "extractAuthors": true,
  "sortBy": "submittedDate",
  "sortOrder": "descending"
}
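
To make the example input more concrete, here is a rough sketch of how these fields might translate into an arXiv API request URL. The endpoint and the `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters belong to the public arXiv API. The mapping itself, including the helper names `build_search_query` and `build_query_url` and the way `searchQuery` and `categories` are combined, is an assumption and may differ from what the actor does. The date filter is left out for brevity.

```python
from urllib.parse import urlencode

API_URL = "http://export.arxiv.org/api/query"  # arXiv's public query endpoint


def build_search_query(run_input: dict) -> str:
    """Combine free text and category codes into one search expression (assumed mapping)."""
    clauses = []
    query = run_input.get("searchQuery", "").strip()
    if query:
        # Pass field-prefixed queries such as cat:cs.AI through unchanged.
        clauses.append(query if ":" in query else f'all:"{query}"')
    categories = run_input.get("categories") or []
    if categories:
        clauses.append("(" + " OR ".join(f"cat:{code}" for code in categories) + ")")
    return " AND ".join(clauses)


def build_query_url(run_input: dict, start: int = 0, page_size: int = 100) -> str:
    """Build the URL for one page of results, starting at offset `start`."""
    remaining = run_input.get("maxResults", 100) - start
    params = {
        "search_query": build_search_query(run_input),
        "start": start,
        "max_results": max(0, min(page_size, remaining)),
        "sortBy": run_input.get("sortBy", "relevance"),
        "sortOrder": run_input.get("sortOrder", "descending"),
    }
    return f"{API_URL}?{urlencode(params)}"


example_input = {
    "searchQuery": "large language models",
    "categories": ["cs.CL", "cs.AI"],
    "maxResults": 50,
    "sortBy": "submittedDate",
    "sortOrder": "descending",
}
example_url = build_query_url(example_input)
```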

Output

Each record includes a `type` field, either `paper` or `author`, that identifies which kind of record it is.

Paper

Field	Type	Description
type	string	`paper`
arxivId	string	arXiv identifier
url	string	arXiv abstract page URL
pdfUrl	string	Direct PDF URL
title	string	Paper title
abstract	string	Paper abstract
publishedAt	string	ISO 8601 submission date
updatedAt	string	ISO 8601 last update date
authors	array	List of `{name, affiliation}` objects
categories	array	arXiv category codes
primaryCategory	string	Primary arXiv category
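
The arXiv API responds with an Atom feed, so each paper record has to be derived from an `<entry>` element. The following is a minimal sketch of that conversion using only the standard library. The sample feed contains placeholder values, and `parse_entry` is a hypothetical helper; whether the actor keeps the version suffix in `arxivId` or represents a missing affiliation as `None` is not stated in the source.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://www.w3.org/2005/Atom", "arxiv": "http://arxiv.org/schemas/atom"}

# Placeholder entry shaped like an arXiv API response; all values are made up.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <updated>2024-01-02T00:00:00Z</updated>
    <published>2024-01-01T00:00:00Z</published>
    <title>An  Example
      Title</title>
    <summary>Placeholder abstract.</summary>
    <author><name>A. Author</name><arxiv:affiliation>Example University</arxiv:affiliation></author>
    <author><name>B. Author</name></author>
    <link href="http://arxiv.org/abs/0000.00000v1" rel="alternate" type="text/html"/>
    <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" rel="related" type="application/pdf"/>
    <arxiv:primary_category term="cs.CL" scheme="http://arxiv.org/schemas/atom"/>
    <category term="cs.CL" scheme="http://arxiv.org/schemas/atom"/>
    <category term="cs.AI" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
</feed>
"""


def parse_entry(entry: ET.Element) -> dict:
    """Convert one Atom <entry> into a paper record with the documented fields."""

    def text(path: str) -> str:
        node = entry.find(path, NS)
        # Collapse the line breaks arXiv inserts into long titles and abstracts.
        return " ".join(node.text.split()) if node is not None and node.text else ""

    pdf_url = ""
    for link in entry.findall("atom:link", NS):
        if link.get("title") == "pdf":
            pdf_url = link.get("href", "")

    authors = []
    for author in entry.findall("atom:author", NS):
        name = author.find("atom:name", NS)
        affiliation = author.find("arxiv:affiliation", NS)
        authors.append({
            "name": name.text.strip() if name is not None and name.text else "",
            # Affiliation is optional in the feed; None marks "not provided" (assumed).
            "affiliation": affiliation.text.strip()
            if affiliation is not None and affiliation.text else None,
        })

    abs_url = text("atom:id")
    primary = entry.find("arxiv:primary_category", NS)
    return {
        "type": "paper",
        "arxivId": abs_url.rsplit("/abs/", 1)[-1],  # keeps the version suffix
        "url": abs_url,
        "pdfUrl": pdf_url,
        "title": text("atom:title"),
        "abstract": text("atom:summary"),
        "publishedAt": text("atom:published"),
        "updatedAt": text("atom:updated"),
        "authors": authors,
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "primaryCategory": primary.get("term") if primary is not None else None,
    }


paper = parse_entry(ET.fromstring(SAMPLE_FEED).find("atom:entry", NS))
```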

Author

Field	Type	Description
type	string	`author`
arxivId	string	Associated paper identifier
paperTitle	string	Associated paper title
name	string	Author name
affiliation	string	Author affiliation
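
Because paper and author records share one dataset, downstream code usually needs to separate them by `type` first. The sketch below does that and then groups author names by affiliation. The sample records are placeholders, and the function names and the `(not provided)` label are illustrative choices, not part of the actor.

```python
from collections import defaultdict

# Placeholder records using the documented field names.
SAMPLE_ITEMS = [
    {"type": "paper", "arxivId": "0000.00000v1", "title": "An Example Title"},
    {"type": "author", "arxivId": "0000.00000v1", "paperTitle": "An Example Title",
     "name": "A. Author", "affiliation": "Example University"},
    {"type": "author", "arxivId": "0000.00000v1", "paperTitle": "An Example Title",
     "name": "B. Author", "affiliation": None},
]


def split_records(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate a mixed dataset into (papers, authors) using the type field."""
    papers = [item for item in items if item.get("type") == "paper"]
    authors = [item for item in items if item.get("type") == "author"]
    return papers, authors


def authors_by_affiliation(authors: list[dict]) -> dict[str, list[str]]:
    """Group unique author names under their affiliation."""
    groups = defaultdict(set)
    for author in authors:
        # Affiliation may be missing when the submitter did not provide it.
        groups[author.get("affiliation") or "(not provided)"].add(author["name"])
    return {affiliation: sorted(names) for affiliation, names in groups.items()}


papers, authors = split_records(SAMPLE_ITEMS)
grouped = authors_by_affiliation(authors)
```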

Limits and caveats

Results are fetched in pages of up to 100 per request; the actor paginates automatically.
A 3-second delay is enforced between requests to respect arXiv's polite usage policy.
Only publicly available papers are returned.
Author affiliations are only available when provided by the submitter.
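
The paging and delay behaviour described above can be sketched as a small generator. This is an illustration under stated assumptions, not the actor's code: `fetch_page` is a hypothetical callable that returns the records for a given offset and page size, and `sleep` is injectable so the loop can be exercised without actually waiting.

```python
import time
from typing import Callable, Iterator

PAGE_SIZE = 100      # results requested per call, as stated above
DELAY_SECONDS = 3.0  # pause between consecutive requests


def paginate(
    fetch_page: Callable[[int, int], list],
    max_results: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield up to max_results records, pausing between page requests."""
    start = 0
    while start < max_results:
        size = min(PAGE_SIZE, max_results - start)
        if start > 0:
            sleep(DELAY_SECONDS)  # no delay is needed before the first request
        page = fetch_page(start, size)
        yield from page
        if len(page) < size:
            break  # a short page means there are no more results
        start += size
```

With `max_results=250` and a source that holds only 230 records, this makes three requests (100, 100, then 50 requested with 30 returned) and pauses twice.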

Pricing

This actor uses Pay Per Event pricing. You are charged only for successfully extracted data.

Event	Price	Description
Paper scraped	$0.003	Each paper successfully extracted
Author scraped	$0.001	Each author record successfully extracted

Tiered discounts apply based on your Apify subscription level. A small actor-start fee may also apply.
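
As a worked example of the event prices in the table, the sketch below estimates the cost of a run from the number of paper and author records. It deliberately ignores tiered discounts and any actor-start fee, since their amounts are not given here; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_PAPER = Decimal("0.003")   # "Paper scraped" event
PRICE_PER_AUTHOR = Decimal("0.001")  # "Author scraped" event


def estimate_cost(papers: int, authors: int) -> Decimal:
    """Estimated charge in USD, before discounts and any actor-start fee."""
    return papers * PRICE_PER_PAPER + authors * PRICE_PER_AUTHOR


# 50 papers with 200 author records in total: 0.150 + 0.200 = 0.350 USD
cost = estimate_cost(50, 200)
```

`Decimal` is used instead of floats so that sums of fractional cents stay exact.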

FAQ

Do I need an arXiv account? No. The arXiv API is completely open and requires no authentication.

Can I download the full PDF? The actor returns direct PDF URLs in the pdfUrl field. You can download them separately.

What categories are available? arXiv uses codes like cs.AI (Artificial Intelligence), cs.LG (Machine Learning), cs.CL (Computation and Language), physics.gen-ph, math.ST, etc. See the full list at arxiv.org.

How recent is the data? Data reflects the current arXiv index at the time of the run. New papers generally become available once arXiv has announced them, which follows a daily cycle rather than happening within minutes of submission.

URL: https://apify.com/automly/arxiv-paper-scraper