URL: https://apify.com/pocesar/json-downloader

API / JSON scraper · Apify


Pricing

$5.00/month + usage

Scrape any API / JSON URLs directly to the dataset, and return them in CSV, XML, HTML, or Excel formats. Transform and filter the output. The scraper also lets you follow pagination recursively from the payload, without visiting the HTML page.

Rating

0.0 (0)

Developer

Paulo Cesar

Maintained by Community

Actor stats

  • Bookmarked: 10
  • Total users: 550
  • Monthly active users: 3
  • Last modified: a year ago

Download and format JSON endpoint data

Download any JSON URLs directly to the dataset, and return them in CSV, XML, HTML, or Excel formats. Transform and filter the output.

Features

  • Optimized, fast and lightweight
  • Small memory requirement
  • Works only with JSON payloads
  • Easy recursion
  • Filter and map complex JSON structures
  • Comes enabled with helper libraries: lodash, moment
  • Full access to your account resources through the Apify variable
  • The run fails if all requests failed

Handling errors

Unlike cheerio-scraper, this scraper lets you handle errors before the handlePageFunction fails. With the handleError input, you can enqueue extra requests before the request fails, which lets you recover or try a different URL.

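{
  handleError: async ({ addRequest, request, response, error }) => {
    // skip further retries on "Unexpected ..." errors or 404 responses
    request.noRetry = error.message.includes('Unexpected') || response.statusCode === 404;
    // enqueue a fallback request before this one fails
    addRequest({
      url: `${request.url}?retry=true`,
    });
  }
}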
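
The retry decision in a handler like this can be pulled out into small pure functions, which makes it easier to check locally. The following is only a sketch: `shouldSkipRetry`, `withRetryFlag`, and `errorInput` are hypothetical names, and the idea that `response` may be missing when no reply arrived is an assumption, not documented behavior of the actor.

```javascript
// Hypothetical helper: decides whether a failed request should skip retries.
// Assumption: `response` can be undefined when the request got no reply.
function shouldSkipRetry(error, response) {
  const isUnexpected = error.message.includes('Unexpected');
  const isNotFound = response != null && response.statusCode === 404;
  return isUnexpected || isNotFound;
}

// Hypothetical helper: adds retry=true while keeping any existing query string.
function withRetryFlag(url) {
  const parsed = new URL(url);
  parsed.searchParams.set('retry', 'true');
  return parsed.toString();
}

// One way the helpers could be wired into the handleError input.
const errorInput = {
  handleError: async ({ addRequest, request, response, error }) => {
    request.noRetry = shouldSkipRetry(error, response);
    addRequest({ url: withRetryFlag(request.url) });
  },
};
```

Using the `URL` class avoids producing a second `?` when the original URL already carries query parameters, which plain string concatenation would do.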

Filter Map function

The filterMap function can filter, map, and enqueue requests at the same time. One difference to keep in mind is that the userData from the current request is passed on to the next request.

const startUrls = [{
  url: "https://example.com",
  userData: {
    firstValue: 0,
  }
}];
// assuming the startUrls above are passed as the actor INPUT
await Apify.call('pocesar/json-downloader', {
  filterMap: async ({ request, addRequest, data }) => {
    if (request.userData.isPost) {
      // userData is inherited from the previous request,
      // so request.userData.firstValue === 0 holds here
      // return the data only after the POST request
      return data;
    } else {
      // add the same request, but as a POST
      addRequest({
        url: `${request.url}/?method=post`,
        method: 'POST',
        payload: {
          username: 'username',
          password: 'password',
        },
        headers: {
          'Content-Type': 'application/json',
        },
        userData: {
          isPost: true
        }
      });
      // omitting the return, or returning a falsy value, skips the output
    }
  },
})
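
The inheritance described above can be pictured as a shallow merge of the parent's userData into the newly added request. This is a hypothetical model for illustration only: `inheritUserData` is not part of the actor, and the rule that the new request's keys win on a clash is an assumption.

```javascript
// Hypothetical model of userData inheritance; the actor's real merge rules
// (for example, which side wins on a key clash) are assumed, not documented.
function inheritUserData(parentRequest, newRequest) {
  return {
    ...newRequest,
    userData: { ...parentRequest.userData, ...newRequest.userData },
  };
}

const parent = { url: 'https://example.com', userData: { firstValue: 0 } };
const child = inheritUserData(parent, {
  url: `${parent.url}/?method=post`,
  method: 'POST',
  userData: { isPost: true },
});

console.log(child.userData); // { firstValue: 0, isPost: true }
```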

Examples

Flatten an object

{
  filterMap: async ({ flattenObjectKeys, data }) => {
    return flattenObjectKeys(data);
  }
}
/**
 * an object like
 * {
 *   "deep": {
 *     "nested": ["state", "state1"]
 *   }
 * }
 *
 * becomes
 * {
 *   "deep.nested.0": "state",
 *   "deep.nested.1": "state1"
 * }
 */
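
For readers who want to see the transformation outside the actor, here is a minimal sketch of a flattening helper that reproduces the example above. `flattenKeys` is a hypothetical stand-in; the actor's own `flattenObjectKeys` may treat edge cases (empty arrays, empty objects, non-JSON values) differently.

```javascript
// Hypothetical re-implementation for illustration, intended for plain JSON data.
// Nested keys and array indexes are joined with dots.
function flattenKeys(value, prefix = '', out = {}) {
  const isContainer = value !== null && typeof value === 'object';
  if (!isContainer) {
    out[prefix] = value;
    return out;
  }
  for (const [key, child] of Object.entries(value)) {
    const path = prefix === '' ? key : `${prefix}.${key}`;
    flattenKeys(child, path, out);
  }
  return out;
}

const flat = flattenKeys({ deep: { nested: ['state', 'state1'] } });
console.log(flat); // { 'deep.nested.0': 'state', 'deep.nested.1': 'state1' }
```

Note that in this sketch empty objects and empty arrays contribute no keys at all.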

Send a POST request to a JSON API

{
  "startUrls": [
    {
      "url": "https://ow0o5i3qo7-dsn.algolia.net/1/indexes/prod_PUBLIC_STORE/query?x-algolia-agent=Algolia%20for%20JavaScript%20(4.13.0)%3B%20Browser%20(lite)&x-algolia-api-key=0ecccd09f50396a4dbbe5dbfb17f4525&x-algolia-application-id=OW0O5I3QO7",
      "method": "POST",
      "payload": "{\"query\":\"instagram\",\"page\":0,\"hitsPerPage\":24,\"restrictSearchableAttributes\":[],\"attributesToHighlight\":[],\"attributesToRetrieve\":[\"title\",\"name\",\"username\",\"userFullName\",\"stats\",\"description\",\"pictureUrl\",\"userPictureUrl\",\"notice\",\"currentPricingInfo\"]}",
      "headers": {
        "content-type": "application/x-www-form-urlencoded"
      }
    }
  ]
}

Follow pagination from payload

{
  filterMap: async ({ addRequest, request, data }) => {
    // assuming zero-based page numbers (as in the "page": 0 payload),
    // the last page is nbPages - 1, so stop before requesting page nbPages
    if (data.nbPages > 1 && data.page + 1 < data.nbPages) {
      // get the current payload from the input
      const payload = JSON.parse(request.payload);
      // change the page number
      request.payload = { ...payload, page: data.page + 1 };
      // add the request for parsing the next page
      addRequest(request);
    }
    return data;
  }
}
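
The page arithmetic can also be isolated in a pure helper so it can be checked without running the actor. This is a sketch under stated assumptions: `nextPageRequest` and `paginationInput` are hypothetical names, page numbers are taken to be zero-based (as the `"page": 0` payload suggests), and the payload is kept as a JSON string so it can be parsed again on the following page. Whether the actor prefers a string or an object payload here is not confirmed by the source.

```javascript
// Hypothetical pure helper: returns the request for the next page,
// or null when the current page is already the last one.
function nextPageRequest(request, data) {
  const nextPage = data.page + 1;
  if (nextPage >= data.nbPages) {
    return null;
  }
  const payload = JSON.parse(request.payload);
  return {
    ...request,
    // keep the payload as a string so the next call can JSON.parse it again
    payload: JSON.stringify({ ...payload, page: nextPage }),
  };
}

// One way the helper could be wired into filterMap.
const paginationInput = {
  filterMap: async ({ addRequest, request, data }) => {
    const next = nextPageRequest(request, data);
    if (next) {
      addRequest(next);
    }
    return data;
  },
};
```

Returning a new object instead of mutating `request` keeps the current request untouched, which makes the helper easier to reason about.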

Omit output if condition is met

{
  filterMap: async ({ addRequest, request, data }) => {
    if (data.hits.length < 10) {
      return;
    }
    return data;
  }
}

Unwind an array of results so that each item becomes a separate dataset item

{
  filterMap: async ({ addRequest, request, data }) => {
    return data.hits; // just return an array from here
  }
}
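
Taken together, the examples above describe three outcomes for a filterMap return value: a falsy value is skipped, an array is unwound into one item per element, and anything else is stored as a single item. The sketch below models that mapping; `toDatasetItems` is a hypothetical function written for illustration and is not taken from the actor's source.

```javascript
// Hypothetical model of how a filterMap return value could map to dataset items.
function toDatasetItems(result) {
  if (!result) {
    return []; // falsy return: nothing is written
  }
  // arrays are unwound, anything else becomes a single item
  return Array.isArray(result) ? result : [result];
}

console.log(toDatasetItems(undefined).length); // 0
console.log(toDatasetItems([{ id: 1 }, { id: 2 }]).length); // 2
console.log(toDatasetItems({ hits: [] }).length); // 1
```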
