Pricing
from $3.00 / 1,000 results
OSF Preprints Scraper
Scrape preprints from the Open Science Framework (OSF) using its public REST API. No authentication or proxy is required.
What It Does
This actor extracts preprint metadata from OSF's preprint archive, which hosts over 190,000 open-access scholarly works across disciplines including psychology, medicine, social sciences, engineering, and more. It supports filtering by tags, subjects, and provider, as well as direct ID-based lookup.
Key Features
- No authentication required: uses the public OSF API
- Two modes: search/browse preprints or fetch specific ones by ID
- Filter by tags, subjects, or provider (e.g., PsyArXiv, SocArXiv, MedArXiv)
- Pagination handled automatically: retrieves up to 1,000 records per run
- Clean structured output with camelCase field names
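The automatic pagination listed above can be pictured with a short sketch. This is not the actor's actual code: it assumes the OSF API v2 returns JSON:API-style pages with a `data` array and a `links.next` URL, and the helper names `fetch_json` and `iter_preprints` are hypothetical.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.request import urlopen

# Public OSF API v2 preprints endpoint (no API key needed).
OSF_PREPRINTS_URL = "https://api.osf.io/v2/preprints/"


def fetch_json(url: str) -> dict:
    """Fetch one page over HTTP and decode the body as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_preprints(
    start_url: str = OSF_PREPRINTS_URL,
    max_items: int = 50,
    fetch: Callable[[str], dict] = fetch_json,
) -> Iterator[dict]:
    """Yield raw records page by page until max_items is reached."""
    url: Optional[str] = start_url
    yielded = 0
    while url and yielded < max_items:
        page = fetch(url)
        for record in page.get("data", []):
            if yielded >= max_items:
                return
            yield record
            yielded += 1
        # Assumption: the last page carries links.next == null (None).
        url = (page.get("links") or {}).get("next")
```

Passing `fetch` as a parameter keeps the loop testable without network access; a stub that returns canned pages can stand in for the real HTTP call.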
Input Fields
| Field | Type | Description |
|---|---|---|
| mode | Select | searchPreprints (default) or getById |
| searchQuery | String | Filter preprints by tag (e.g. machine learning) |
| subjectFilter | String | Filter by subject text (e.g. Medicine and Health Sciences) |
| provider | String | Filter by provider (e.g. psyarxiv, socarxiv, osf) |
| preprintIds | Array | List of OSF preprint IDs (for getById mode) |
| maxItems | Integer | Max number of results (1 to 1000, default 50) |
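As an illustration of how these fields fit together, the sketch below assembles a run input as a plain dictionary. The function `build_run_input` is a hypothetical helper, and the rule that getById mode needs at least one ID is an assumption inferred from the table, not documented behavior.

```python
from typing import List, Optional

VALID_MODES = ("searchPreprints", "getById")


def build_run_input(
    mode: str = "searchPreprints",
    search_query: Optional[str] = None,
    subject_filter: Optional[str] = None,
    provider: Optional[str] = None,
    preprint_ids: Optional[List[str]] = None,
    max_items: int = 50,
) -> dict:
    """Return an input dict using the field names from the table above."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    # Assumption: getById is only meaningful with at least one ID.
    if mode == "getById" and not preprint_ids:
        raise ValueError("getById mode needs preprintIds")

    run_input = {"mode": mode, "maxItems": max_items}
    optional = {
        "searchQuery": search_query,
        "subjectFilter": subject_filter,
        "provider": provider,
        "preprintIds": preprint_ids,
    }
    # Leave out unset fields so the actor's defaults apply.
    run_input.update({key: value for key, value in optional.items() if value})
    return run_input


search_input = build_run_input(search_query="machine learning", provider="psyarxiv")
lookup_input = build_run_input(mode="getById", preprint_ids=["snveb_v2"])
```

The resulting dictionary can be serialized with `json.dumps` and pasted into the actor's JSON input editor, or passed as the run input by whichever client starts the run.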
Provider Examples
Popular OSF preprint providers you can filter by:
| Provider ID | Description |
|---|---|
| osf | General OSF preprints |
| psyarxiv | Psychology |
| socarxiv | Social sciences |
| medarxiv | Medicine |
| eartharxiv | Earth sciences |
| engrxiv | Engineering |
| biorxiv | Biology |
| ecsarxiv | Electrochemical Society |
Output Fields
Each item in the dataset contains:
| Field | Type | Description |
|---|---|---|
| preprintId | String | Unique OSF preprint ID (e.g. abc12_v2) |
| title | String | Title of the preprint |
| description | String | Abstract or summary |
| doi | String | Digital Object Identifier |
| datePublished | String | Publication date (ISO 8601) |
| dateCreated | String | Creation date (ISO 8601) |
| dateModified | String | Last modified date (ISO 8601) |
| tags | Array | Author-assigned tags |
| isPublished | Boolean | Whether the preprint is publicly published |
| provider | String | Provider ID (e.g. psyarxiv) |
| subjects | Array | Subject classifications |
| license | String | License name (e.g. CC-By Attribution 4.0) |
| sourceUrl | String | Direct URL to the preprint on OSF |
| recordType | String | Always "preprint" |
| scrapedAt | String | Timestamp when the record was scraped |
Example Output
{"preprintId":"snveb_v2","title":"Beyond the Resume: Comparing the Predictive Power of Personality Assessments","description":"This study examines employee turnover prediction using machine learning...","doi":"10.31234/osf.io/snveb_v2","datePublished":"2026-05-26T13:58:36.783000Z","dateCreated":"2026-05-25T09:31:34.214181Z","dateModified":"2026-05-26T13:58:36.814700Z","tags":["Machine learning","Employee turnover","Explainable AI"],"isPublished":true,"provider":"psyarxiv","subjects":["Industrial and Organizational Psychology","Quantitative Methods"],"sourceUrl":"https://osf.io/preprints/psyarxiv/snveb_v2/","recordType":"preprint","scrapedAt":"2026-05-30T10:00:00.000000+00:00"}
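When consuming the dataset, it can help to check each record against the documented output fields before relying on them. The sketch below is one possible check, with `EXPECTED_TYPES` and `check_record` as hypothetical names; it treats every documented field as required, which is an assumption, since some fields may be omitted for a given record.

```python
import json

# Field names and types taken from the Output Fields table.
EXPECTED_TYPES = {
    "preprintId": str, "title": str, "description": str, "doi": str,
    "datePublished": str, "dateCreated": str, "dateModified": str,
    "tags": list, "isPublished": bool, "provider": str, "subjects": list,
    "license": str, "sourceUrl": str, "recordType": str, "scrapedAt": str,
}


def check_record(record: dict) -> list:
    """Return a list of problems: missing fields or unexpected value types."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif record[field] is not None and not isinstance(record[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems


# A trimmed copy of the example record (description shortened).
sample = json.loads(
    '{"preprintId":"snveb_v2","title":"Beyond the Resume","description":"...",'
    '"doi":"10.31234/osf.io/snveb_v2","datePublished":"2026-05-26T13:58:36.783000Z",'
    '"dateCreated":"2026-05-25T09:31:34.214181Z","dateModified":"2026-05-26T13:58:36.814700Z",'
    '"tags":["Machine learning"],"isPublished":true,"provider":"psyarxiv",'
    '"subjects":["Quantitative Methods"],"sourceUrl":"https://osf.io/preprints/psyarxiv/snveb_v2/",'
    '"recordType":"preprint","scrapedAt":"2026-05-30T10:00:00.000000+00:00"}'
)
problems = check_record(sample)
print(problems)  # the example record carries no license key
```

Downstream code that reads `license` may therefore want to use `record.get("license")` rather than direct indexing.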
Use Cases
- Academic research: Track preprints in specific fields
- Literature reviews: Collect papers by subject or tag for systematic reviews
- Trend analysis: Monitor publication rates by subject over time
- Citation tracking: Gather DOIs for downstream citation analysis
- Content aggregation: Build databases of open-access scholarly works
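Two of the use cases above, trend analysis and citation tracking, reduce to small aggregations over the dataset items. The sketch below assumes the items have already been loaded as a list of dictionaries with the documented field names; `monthly_counts_by_subject` and `collect_dois` are hypothetical helpers.

```python
from collections import Counter
from typing import Iterable, List


def monthly_counts_by_subject(records: Iterable[dict]) -> Counter:
    """Count preprints per (subject, 'YYYY-MM') pair using datePublished."""
    counts: Counter = Counter()
    for record in records:
        published = record.get("datePublished")
        if not published:
            continue  # skip records without a publication date
        month = published[:7]  # ISO 8601 strings start with YYYY-MM
        for subject in record.get("subjects") or []:
            counts[(subject, month)] += 1
    return counts


def collect_dois(records: Iterable[dict]) -> List[str]:
    """Return unique, non-empty DOIs in first-seen order."""
    seen = {}
    for record in records:
        doi = record.get("doi")
        if doi:
            seen.setdefault(doi, None)
    return list(seen)
```

Slicing the first seven characters avoids full timestamp parsing, which is enough for month-level buckets as long as the dates stay in ISO 8601 form.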
FAQs
Q: Does this require an API key?
A: No. The OSF public API is freely accessible without authentication.
Q: How many results can I get?
A: Up to 1,000 per run. OSF has 190,000+ preprints total.
Q: Can I filter by date?
A: Not directly via this actor's inputs. You can filter by tag and subject, then filter or sort the results by datePublished in post-processing.
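A minimal post-processing sketch for the date case follows. It assumes the dataset items are available as dictionaries with an ISO 8601 `datePublished` string; the function name `filter_by_date` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, rewriting a trailing 'Z' as '+00:00'."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def filter_by_date(
    records: Iterable[dict],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[dict]:
    """Keep records with start <= datePublished < end, newest first.

    start and end must be timezone-aware, because the parsed
    timestamps are aware and Python refuses mixed comparisons.
    """
    kept = []
    for record in records:
        raw = record.get("datePublished")
        if not raw:
            continue  # unpublished or undated records are dropped
        published = parse_timestamp(raw)
        if start and published < start:
            continue
        if end and published >= end:
            continue
        kept.append((published, record))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in kept]
```

For example, calling `filter_by_date(items, start=datetime(2026, 5, 1, tzinfo=timezone.utc))` would keep only items published on or after 1 May 2026 (UTC).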
Q: What's the difference between providers?
A: Different academic communities host preprint servers on OSF (e.g., PsyArXiv for psychology). Using the provider filter restricts results to that community.
Q: Are all preprints peer-reviewed?
A: No. Preprints are posted before peer review. The isPublished field indicates acceptance by the OSF preprint server, not journal peer review.
Q: How current is the data?
A: The OSF API returns live data. New preprints appear within hours of submission.
