Pricing
from $3.00 / 1,000 results
Figshare Scraper
Scrape research articles, datasets, code, figures, and more from Figshare using its public REST API. No authentication or proxy is required.
What It Does
This actor extracts metadata and content information from Figshare, one of the world's largest open research data repositories. It supports full-text keyword search, direct article ID lookup, and institution-specific article browsing across all Figshare content types.
Key Features
- No authentication required: uses the public Figshare API
- Three modes: keyword search, fetch by ID, or browse by institution
- Filter by content type: datasets, papers, theses, code, figures, presentations, and more
- Full detail extraction: authors, categories, tags, license, files, DOI
- HTML description stripping: descriptions are returned as clean plain text
- Automatic pagination: retrieves up to 1,000 results per run
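The HTML stripping feature can be approximated with Python's standard library. This is a minimal sketch of the general technique, not the actor's actual implementation; the names `_TextExtractor` and `strip_html` are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards tags (hypothetical helper)."""

    def __init__(self) -> None:
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data: str) -> None:
        self._chunks.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace left behind by the removed tags.
        return " ".join("".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the plain-text content of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `strip_html("<p>Open <b>research</b> data &amp; code</p>")` returns `Open research data & code`. Adjacent block elements with no whitespace between them are joined without a space in this sketch.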
Input Fields
| Field | Type | Description |
|---|---|---|
| mode | Select | searchArticles (default), getById, or searchByInstitution |
| searchQuery | String | Keyword query (e.g. climate change, CRISPR) |
| itemType | Select | Filter by content type (see table below) |
| articleIds | Array | Figshare article IDs for getById mode |
| institutionId | String | Institution numeric ID for searchByInstitution mode |
| maxItems | Integer | Max results to return (1 to 1,000, default 50) |
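The input fields above can be assembled and sanity-checked before a run. The sketch below is illustrative only: `build_input` is a hypothetical helper, and the per-mode checks (IDs required for getById, an institution ID required for searchByInstitution) are assumptions inferred from the field descriptions, not documented validation rules.

```python
from typing import Any, Dict, List, Optional

VALID_MODES = {"searchArticles", "getById", "searchByInstitution"}


def build_input(
    mode: str = "searchArticles",
    search_query: str = "",
    item_type: str = "",
    article_ids: Optional[List[Any]] = None,
    institution_id: str = "",
    max_items: int = 50,
) -> Dict[str, Any]:
    """Build an input object using the field names from the Input Fields table."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    # Assumed rules: each lookup mode needs its own identifier field.
    if mode == "getById" and not article_ids:
        raise ValueError("getById mode needs at least one article ID")
    if mode == "searchByInstitution" and not institution_id:
        raise ValueError("searchByInstitution mode needs an institution ID")
    return {
        "mode": mode,
        "searchQuery": search_query,
        "itemType": item_type,
        "articleIds": list(article_ids or []),
        "institutionId": institution_id,
        "maxItems": max_items,
    }
```

A keyword search for datasets might then be expressed as `build_input(search_query="CRISPR", item_type="3", max_items=100)`.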
Item Types
| Value | Type |
|---|---|
| (empty) | All types |
| 1 | Figure |
| 2 | Media |
| 3 | Dataset |
| 4 | Poster |
| 5 | Paper |
| 6 | Presentation |
| 7 | Thesis |
| 8 | Code |
| 9 | Preprint |
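When inputs are generated programmatically, a small lookup keeps the numeric codes out of calling code. The values below are copied from the table above; the helper name `item_type_value` is hypothetical, and passing the codes as strings is an assumption about how the Select field is serialised.

```python
# Codes copied from the Item Types table; string values are an assumption.
ITEM_TYPES = {
    "figure": "1",
    "media": "2",
    "dataset": "3",
    "poster": "4",
    "paper": "5",
    "presentation": "6",
    "thesis": "7",
    "code": "8",
    "preprint": "9",
}


def item_type_value(name: str) -> str:
    """Translate a human-readable type name into an itemType input value."""
    key = name.strip().lower()
    if key in ("", "all"):
        return ""  # an empty value selects all types
    try:
        return ITEM_TYPES[key]
    except KeyError:
        raise ValueError(f"unknown item type: {name!r}") from None
```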
Output Fields
Each item in the dataset contains:
| Field | Type | Description |
|---|---|---|
| articleId | Integer | Unique Figshare article ID |
| title | String | Article title |
| description | String | Plain-text abstract/description (HTML stripped) |
| doi | String | Digital Object Identifier |
| publishedDate | String | Publication date (ISO 8601) |
| modifiedDate | String | Last modified date (ISO 8601) |
| authors | Array | List of author full names |
| tags | Array | Keyword tags |
| categories | Array | Subject category names |
| license | String | License name (e.g. CC BY 4.0) |
| figshareUrl | String | Public URL on Figshare |
| sourceUrl | String | Same as figshareUrl |
| thumbUrl | String | Thumbnail image URL |
| downloadUrl | String | Direct download URL for the primary file |
| itemType | String | Content type name (e.g. dataset, paper) |
| citation | String | Full citation string |
| viewCount | Integer | Number of views (when available) |
| citationCount | Integer | Number of citations (when available) |
| recordType | String | Always "article" |
| scrapedAt | String | Timestamp when the record was scraped |
Example Output
{"articleId":32513898,"title":"Perspectives of Allied Health Professionals on Digital Health Services in South Australia","description":"This research project explores the perspectives of allied health professionals on digital health services in South Australia.","doi":"10.25909/32513898.v1","publishedDate":"2026-05-30T02:03:44Z","modifiedDate":"2026-05-30T02:03:44Z","authors":["Muhammad Khan"],"tags":["digital health activism"],"categories":["Audiology","Occupational therapy","Physiotherapy","Speech pathology"],"license":"CC BY 4.0","figshareUrl":"https://adelaide.figshare.com/articles/poster/...","sourceUrl":"https://adelaide.figshare.com/articles/poster/...","thumbUrl":"https://s3-eu-west-1.amazonaws.com/ppreviews-adelaide-.../thumb.png","downloadUrl":"https://ndownloader.figshare.com/files/65104542","itemType":"poster","citation":"Khan, Muhammad (2026). Perspectives of Allied Health...","recordType":"article","scrapedAt":"2026-05-30T10:00:00.000000+00:00"}
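A record shaped like the example above can be post-processed with the standard library. The following sketch assumes the field names from the Output Fields table; `parse_timestamp` and `summarize` are hypothetical helpers, and `viewCount` and `citationCount` are read with `.get()` because they are only present when available.

```python
import json
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as those in publishedDate or scrapedAt."""
    # Python versions before 3.11 reject a trailing "Z", so use an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out a few fields and derive a resolvable DOI link."""
    doi = record.get("doi")
    return {
        "articleId": record["articleId"],
        "published": parse_timestamp(record["publishedDate"]),
        "doiUrl": f"https://doi.org/{doi}" if doi else None,
        # Optional statistics: None when the record does not carry them.
        "viewCount": record.get("viewCount"),
        "citationCount": record.get("citationCount"),
    }


if __name__ == "__main__":
    raw = (
        '{"articleId": 32513898, "doi": "10.25909/32513898.v1", '
        '"publishedDate": "2026-05-30T02:03:44Z"}'
    )
    print(summarize(json.loads(raw)))
```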
Use Cases
- Research discovery: Find datasets or papers by keyword across all disciplines
- Data science: Collect open datasets for analysis and modeling
- Literature reviews: Gather papers, theses, and preprints by topic
- Institution profiling: Browse all public outputs from a specific university
- Open science auditing: Track open data and code availability by subject
- Citation analysis: Collect DOIs for downstream citation graph work
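Two of these use cases, citation analysis and auditing by subject, reduce to simple aggregation over the output records. The sketch below assumes the records have already been loaded as a list of dictionaries with the `doi` and `categories` fields described earlier; the function names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def collect_dois(records: Iterable[Dict[str, Any]]) -> List[str]:
    """Return unique DOIs in first-seen order, skipping records without one."""
    seen = set()
    dois = []
    for record in records:
        doi = record.get("doi")
        if doi and doi not in seen:
            seen.add(doi)
            dois.append(doi)
    return dois


def count_by_category(records: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many records fall under each subject category."""
    counts = Counter()
    for record in records:
        counts.update(record.get("categories") or [])
    return counts
```

The DOI list can feed a citation-graph tool of your choice, while the category counts give a quick view of where open outputs are concentrated.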
FAQs
Q: Does this require an API key?
A: No. The Figshare public API is freely accessible without authentication.
Q: How many results can I get?
A: Up to 1,000 per run (the maxItems ceiling), even though Figshare hosts millions of items.
Q: What does searchByInstitution mode do?
A: It retrieves all public articles published by a specific institution on Figshare. You need the institution's numeric ID (e.g., University of Manchester = 2).
Q: Are descriptions HTML-free?
A: Yes. The actor strips all HTML tags from descriptions, returning clean plain text.
Q: Why is viewCount sometimes absent?
A: The Figshare public API does not always return view/citation statistics in the article detail endpoint. The field is only included when the data is available.
Q: Can I search for code repositories?
A: Yes. Set itemType to 8 (Code) and use any keyword in searchQuery.
Q: How fresh is the data?
A: The Figshare API returns live data. New uploads appear within minutes of publication.
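The 1,000-result cap implies that results are fetched in pages and trimmed to the requested total. The sketch below shows one way such a plan could be computed; it is not the actor's code, `page_plan` is a hypothetical helper, and the page size of 100 is an assumed value.

```python
from typing import List, Tuple


def page_plan(max_items: int, page_size: int = 100) -> List[Tuple[int, int]]:
    """Return (page_number, items_to_keep) pairs that together cover max_items."""
    if not 1 <= max_items <= 1000:
        raise ValueError("max_items must be between 1 and 1000")
    if page_size < 1:
        raise ValueError("page_size must be positive")
    plan = []
    remaining = max_items
    page = 1
    while remaining > 0:
        take = min(page_size, remaining)  # the last page may be partial
        plan.append((page, take))
        remaining -= take
        page += 1
    return plan
```

With these assumptions, `page_plan(250)` yields `[(1, 100), (2, 100), (3, 50)]`: two full pages and a final page trimmed to 50 items.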
