Pricing
from $10.00 / 1,000 results
Universal Data Structure Converter
A production-grade Apify actor that converts between HTML, XML, CSV, YAML, and JSON formats. Supports 9+ conversion types with smart auto-detection, nested JSON flattening, HTML table scraping, batch URL processing, and full customization.
Supported Conversions
| # | Conversion | Description |
|---|---|---|
| 1 | HTML → JSON | Parse DOM tree or extract <table> data |
| 2 | XML → JSON | Full tree with attributes & namespaces |
| 3 | CSV → JSON | With auto type-casting (int/float/bool) |
| 4 | YAML → JSON | Single or multi-document streams |
| 5 | JSON → XML | Custom root/item tags, XML declaration |
| 6 | JSON → CSV | Nested object flattening to dot-columns |
| 7 | JSON → YAML | Block or flow style output |
| 8 | YAML → XML | Chained (YAML → JSON → XML) |
| 9 | CSV → XML | Chained (CSV → JSON → XML) |
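Rows 8 and 9 are described as chained conversions that pass through JSON. A minimal standard-library sketch of the CSV → JSON → XML chain is shown here; the function names `csv_to_records` and `records_to_xml` are illustrative assumptions and are not taken from the actor's source. It also assumes every CSV header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_records(text: str, delimiter: str = ",") -> list[dict]:
    """Step 1: parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def records_to_xml(records: list[dict], root_tag: str = "root", item_tag: str = "item") -> str:
    """Step 2: serialize the intermediate records as an XML string."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            # Assumes each key is usable as an XML tag name.
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Chain the two steps: CSV -> list of dicts -> XML.
xml_text = records_to_xml(csv_to_records("name,age\nAlice,30\n"))
print(xml_text)  # <root><item><name>Alice</name><age>30</age></item></root>
```

The defaults `root` and `item` mirror the documented defaults of `xmlRootTag` and `xmlListItemTag`.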
Key Features
- Auto-Detection: Set conversion to `auto` and the actor detects whether input is HTML, XML, JSON, YAML, or CSV
- URL Fetching: Provide a list of URLs to fetch and convert in batch
- HTML Table Scraping: Extract `<table>` elements directly into structured JSON arrays
- Smart Type-Casting: CSV values like `"30"`, `"true"`, `"99.5"` are auto-cast to `int`, `bool`, `float`
- Nested Flattening: `{"a": {"b": 1}}` becomes CSV column `a.b` when exporting JSON → CSV
- Proxy Support: Use Apify Proxy for fetching URLs behind firewalls
- Custom Delimiters: Comma, tab, semicolon, pipe for CSV input/output
- Pretty-Print or Minify: Configurable indentation or compact output
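The README does not say how auto-detection decides between formats. One plausible approach is a chain of cheap structural checks, sketched below; `detect_format` and its rules are assumptions for illustration, not the actor's actual logic, and a heuristic like this can misclassify ambiguous input.

```python
import json


def detect_format(text: str) -> str:
    """Guess the input format from structural cues (heuristic only)."""
    stripped = text.lstrip()
    head = stripped[:200].lower()
    # HTML markers are checked before the generic "starts with <" XML test.
    if head.startswith("<!doctype html") or "<html" in head or "<table" in head:
        return "html"
    if stripped.startswith("<"):
        return "xml"
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass  # may still be YAML flow style, so keep checking
    first_line = stripped.splitlines()[0] if stripped else ""
    has_delimiter = any(d in first_line for d in (",", ";", "\t", "|"))
    if has_delimiter and ": " not in first_line:
        return "csv"
    # YAML is the fallback because almost any plain text is valid YAML.
    return "yaml"


print(detect_format("name,age\nAlice,30"))
print(detect_format("name: Alice\nage: 30"))
```

Ordering matters in a sketch like this: the strictest checks run first and the most permissive format comes last.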
Input Schema
| Parameter | Type | Default | Description |
|---|---|---|---|
| `conversionType` | string | auto | Conversion to perform (or auto to detect) |
| `outputFormat` | string | json | Target format when using auto-detect |
| `inputData` | string | (sample) | Raw data to convert (paste directly) |
| `sourceUrls` | array | [] | URLs to fetch and convert in batch |
| `csvDelimiter` | string | , | CSV column separator |
| `csvHasHeader` | boolean | true | Treat first CSV row as column names |
| `typeCast` | boolean | true | Auto-cast CSV strings to native types |
| `flattenNested` | boolean | true | Flatten nested JSON for CSV export |
| `flattenSeparator` | string | . | Separator for flattened key names |
| `xmlRootTag` | string | root | Root element name for XML output |
| `xmlListItemTag` | string | item | Tag for array items in XML output |
| `xmlDeclaration` | boolean | true | Include XML <?xml?> header |
| `xmlStripNamespaces` | boolean | true | Remove namespace prefixes from XML tags |
| `htmlExtractTables` | boolean | false | Extract only <table> elements from HTML |
| `htmlParser` | string | lxml | BeautifulSoup parser engine |
| `yamlMultiDoc` | boolean | false | Parse multi-document YAML streams |
| `indent` | integer | 2 | Spaces for pretty-printing (0-8) |
| `minify` | boolean | false | Compact output (overrides indent) |
| `outputAsString` | boolean | false | Store result as raw string instead of parsed JSON |
| `proxyConfiguration` | object | disabled | Proxy settings for URL fetching |
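To make the CSV options concrete, here is a rough sketch of how `csvDelimiter`, `csvHasHeader`, and `typeCast` could interact. The helper names, the casting order (bool, then int, then float), and the `col0`, `col1` fallback column names are all assumptions; the actor's real rules may differ.

```python
import csv
import io


def cast_value(value: str):
    """Cast a CSV string to bool, int, or float when it parses cleanly."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for caster in (int, float):
        try:
            return caster(value)
        except ValueError:
            continue
    return value  # not a recognizable scalar: keep the original string


def csv_to_json(text: str, delimiter: str = ",", has_header: bool = True, type_cast: bool = True) -> list[dict]:
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if has_header:
        header, body = rows[0], rows[1:]
    else:
        # Hypothetical column naming for headerless input.
        header = [f"col{i}" for i in range(len(rows[0]))]
        body = rows
    convert = cast_value if type_cast else (lambda v: v)
    return [{key: convert(cell) for key, cell in zip(header, row)} for row in body]


records = csv_to_json("name;age;active;score\nAlice;30;true;99.5\n", delimiter=";")
print(records)  # [{'name': 'Alice', 'age': 30, 'active': True, 'score': 99.5}]
```

Note that Python's `float` also accepts strings such as `nan` and `inf`, so a production caster would likely need stricter rules than this sketch.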
Usage Examples
Example 1: Convert CSV → JSON (default)
Just run the actor with defaults; it ships with sample CSV data and auto-detects the conversion:
{"conversionType":"auto","outputFormat":"json"}
Example 2: HTML Table Scraping
{"conversionType":"html2json","inputData":"<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>","htmlExtractTables":true}
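For intuition, the table in this input can be turned into records with a few lines of Python. The actor lists BeautifulSoup as its HTML parser; the sketch below swaps in the standard-library `html.parser` so that it runs without extra dependencies, and the class and function names are illustrative assumptions. Cell values stay as strings here; whether the actor type-casts table cells is not stated.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html: str) -> list[dict]:
    """Treat the first row as the header and map the remaining rows onto it."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(
    "<table><tr><th>Name</th><th>Age</th></tr>"
    "<tr><td>Alice</td><td>30</td></tr></table>"
)
print(records)  # [{'Name': 'Alice', 'Age': '30'}]
```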
Example 3: Batch URL Processing
{"conversionType":"auto","outputFormat":"json","sourceUrls":[{"url":"https://example.com/data.csv"},{"url":"https://api.example.com/config.yaml"}]}
Example 4: JSON → CSV with Flattening
{"conversionType":"json2csv","inputData":"[{\"id\":1,\"name\":\"Alice\",\"address\":{\"city\":\"NYC\",\"zip\":\"10001\"}}]","flattenNested":true,"flattenSeparator":"."}
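The flattening step can be sketched as a small recursive function that joins nested keys with the separator. This is an illustrative sketch rather than the actor's implementation: `flatten` and `json_to_csv` are assumed names, and arrays are left untouched because the README does not describe how they are handled.

```python
import csv
import io
import json


def flatten(obj: dict, sep: str = ".", prefix: str = "") -> dict:
    """Turn {"a": {"b": 1}} into {"a.b": 1}; non-dict values are kept as-is."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, sep, name))
        else:
            flat[name] = value
    return flat


def json_to_csv(text: str, sep: str = ".") -> str:
    rows = [flatten(item, sep) for item in json.loads(text)]
    # Column order follows first appearance across all rows.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


csv_text = json_to_csv('[{"id":1,"name":"Alice","address":{"city":"NYC","zip":"10001"}}]')
print(csv_text)
# id,name,address.city,address.zip
# 1,Alice,NYC,10001
```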
Example 5: XML → JSON (Strip Namespaces)
{"conversionType":"xml2json","inputData":"<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>","xmlStripNamespaces":true}
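The exact JSON shape the actor produces for XML is not documented here. The sketch below shows one common mapping using `xml.etree.ElementTree`; the `@` prefix for attributes, the `#text` key for mixed content, and the function names are assumptions borrowed from widespread convention, not confirmed actor behavior.

```python
import json
import xml.etree.ElementTree as ET


def strip_ns(name: str) -> str:
    """Drop the '{namespace-uri}' prefix ElementTree puts on qualified names."""
    return name.split("}", 1)[1] if name.startswith("{") else name


def element_to_dict(elem: ET.Element, strip_namespaces: bool = True):
    clean = strip_ns if strip_namespaces else (lambda n: n)
    node = {f"@{clean(k)}": v for k, v in elem.attrib.items()}
    for child in elem:
        key = clean(child.tag)
        value = element_to_dict(child, strip_namespaces)
        if key in node:
            # Repeated sibling tags collapse into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf element without attributes: just its text
    if text:
        node["#text"] = text
    return node


root = ET.fromstring("<catalog><book id='1'><title>Hello</title></book></catalog>")
result = {strip_ns(root.tag): element_to_dict(root)}
print(json.dumps(result))  # {"catalog": {"book": {"@id": "1", "title": "Hello"}}}
```

With a namespaced document such as `<a:x xmlns:a="urn:demo">`, ElementTree reports the tag as `{urn:demo}x`, which `strip_ns` reduces to `x`.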
Output Format
Each converted item is stored in the dataset with this structure:
{"source":"inline_input","conversion":"csv2json","inputFormat":"csv","outputFormat":"json","timestamp":"2026-04-01T17:30:00.000Z","status":"success","error":null,"data":[ ... ]}
- `data`: Parsed result (for JSON outputs)
- `rawOutput`: Raw string result (for XML/CSV/YAML outputs, or when `outputAsString` is true)
- `status`: `"success"` or `"failed"`
- `error`: Error message if conversion failed
Run statistics are stored in the Key-Value Store under the key RUN_STATS.
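A consumer of the dataset might unwrap each record along these lines. This is a sketch based only on the fields listed above; `extract_payload` is a hypothetical helper, and it assumes `rawOutput` is absent or null whenever the parsed result is stored in `data`.

```python
def extract_payload(record: dict):
    """Return the converted payload of one dataset record, or raise on failure."""
    if record.get("status") != "success":
        raise ValueError(f"{record.get('source')}: {record.get('error')}")
    # String results (XML/CSV/YAML, or outputAsString=true) live in 'rawOutput';
    # parsed JSON results live in 'data'.
    if record.get("rawOutput") is not None:
        return record["rawOutput"]
    return record.get("data")


ok = {"source": "inline_input", "status": "success", "error": None, "data": [{"id": 1}]}
print(extract_payload(ok))  # [{'id': 1}]
```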
Local Development
# Clone and install
cd apify-data-converter
pip install -r requirements.txt
# Run locally with Apify CLI
apify run --input-file=input.json
Dependencies
- `apify`: Apify SDK for Python
- `httpx`: Async HTTP client for URL fetching
- `pyyaml`: YAML parsing and serialization
- `beautifulsoup4` + `lxml`: HTML parsing
- `html5lib`: Lenient HTML parser for broken markup
