
URL: https://www.scriptbyai.com/extract-structured-data/

Extract Structured Data from Any Site with JsonGenius And AI



JsonGenius is a robust, self-hosted, AI-powered scraping API written in Go that makes it easy to extract structured data (defined by a JSON Schema) from any webpage. It uses Chromium to render pages like a normal browser, so it works on complex sites.

With JsonGenius and Docker, you can set up a scraping API to pull data from sites in just a few minutes. Define the schema you want, send a URL, and JsonGenius will return extracted data matching your schema. This makes it easy to collect and work with all kinds of web data.

How to use it:

1. Clone JsonGenius from GitHub and navigate to the jsongenius directory:

git clone https://github.com/semanser/jsongenius
cd jsongenius

2. Set your OpenAI API key as an environment variable:

export OPEN_AI_KEY=...

3. Run docker-compose up. Once the containers have started, the API is available at http://localhost:3001.

4. To scrape a website, send a POST request containing the page URL and a JSON Schema that describes the data you want to extract:

curl -X POST -H "Content-Type: application/json" -d '{
  "url": "/path/to/",
  "schema": {
    "type": "object",
    "properties": {
      "products": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "description": "The product name"
            },
            "price": {
              "type": "number",
              "description": "The price of the product in USD"
            }
          }
        }
      }
    }
  }
}' http://localhost:3001/lookup
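
If you would rather call the API from a script than from curl, the same request can be sketched in Python with only the standard library. This is a rough illustration, not part of JsonGenius itself: the helper names (product_schema, build_request, extract) are made up for this example, the endpoint is copied from the curl command above, and the code assumes the API replies with a JSON body, which the steps above do not confirm.

```python
import json
import urllib.request

# Assumption: same host, port, and path as the curl example above.
DEFAULT_ENDPOINT = "http://localhost:3001/lookup"


def product_schema() -> dict:
    """Return the JSON Schema from the curl example: products with a name and a price."""
    return {
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {
                            "type": "string",
                            "description": "The product name",
                        },
                        "price": {
                            "type": "number",
                            "description": "The price of the product in USD",
                        },
                    },
                },
            }
        },
    }


def build_request(page_url: str, schema: dict,
                  endpoint: str = DEFAULT_ENDPOINT) -> urllib.request.Request:
    """Build the POST request (URL plus schema as a JSON body) without sending it."""
    body = json.dumps({"url": page_url, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract(page_url: str, schema: dict,
            endpoint: str = DEFAULT_ENDPOINT, timeout: float = 120.0):
    """Send the request and decode the reply.

    Assumption: the response body is JSON. The generous timeout is a guess,
    since the service has to render the page and query the model.
    """
    request = build_request(page_url, schema, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers from step 3 running, a call such as extract("https://example.com/products", product_schema()) would send the same payload as the curl command; the example.com address is only a placeholder for the page you want to scrape.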
