
URL: https://www.geeksforgeeks.org/python/extract-json-from-html-using-beautifulsoup-in-python/

Extract JSON from HTML using BeautifulSoup in Python - GeeksforGeeks



Extract JSON from HTML using BeautifulSoup in Python

Last Updated : 23 Jul, 2025

In this article, we extract data from an HTML page using BeautifulSoup in Python and save it as JSON.

Modules needed

  • bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library. To install it, run the following command in the terminal.
pip install bs4
  • requests: Requests lets you send HTTP/1.1 requests extremely easily. It is also not part of the Python standard library. To install it, run the following command in the terminal.
pip install requests

Approach:

  • Import all the required modules.
  • Pass the URL to the requests.get() function. It sends a GET request to that URL and returns a response object.

Syntax: requests.get(url, args)
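
As a minimal sketch of this step, the helper below wraps requests.get(). The function name fetch_html, the 10-second timeout, and the example URL in the comment are assumptions made for illustration; they do not come from the article.

```python
import requests


def fetch_html(url, timeout=10):
    """Send a GET request to url and return the response body as text."""
    # A timeout keeps the call from hanging forever on an unresponsive server.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page.
    response.raise_for_status()
    return response.text


# Example call (hypothetical URL, not executed here):
# html = fetch_html("https://example.com/customers")
```

Checking the status before parsing is a design choice: it surfaces a failed request immediately rather than producing an empty result later.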

  • Now parse the HTML content using bs4.

Syntax: BeautifulSoup(page.text, 'html.parser')

Parameters:

  • page.text : The raw HTML content.
  • html.parser : The HTML parser we want to use.

  • Now get all the required data with the find() function.
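
The parsing and find() steps can be sketched as follows. The inline HTML string is a made-up stand-in for page.text so that the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for page.text
html = """
<html>
  <head><title>Customer Directory</title></head>
  <body>
    <h1>Customers</h1>
    <p>First paragraph</p>
    <p>Second paragraph</p>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None when nothing matches.
heading = soup.find("h1")
first_paragraph = soup.find("p")
missing = soup.find("table")

print(heading.get_text())          # Customers
print(first_paragraph.get_text())  # First paragraph
print(missing)                     # None
```

Because find() returns None for a missing tag, it is worth checking the result before calling get_text() on pages whose structure may vary.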

Now find the customer list by searching for the li, a, or p tags that carry a unique class or id. To identify them, open the webpage in the browser, right-click the relevant element, and inspect it, as shown in the figure.

[Figure: inspecting the relevant element in the browser]
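
Selecting tags by class or id might look like the sketch below. The markup, the id "customer-list", and the class names "customer" and "city" are all hypothetical; on a real page, substitute whatever the browser inspector shows.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; real pages will use different ids and classes.
html = """
<ul id="customer-list">
  <li class="customer"><a href="/c/1">Asha Rao</a><p class="city">Pune</p></li>
  <li class="customer"><a href="/c/2">Liam Chen</a><p class="city">Austin</p></li>
</ul>
<ul id="other"><li>Unrelated item</li></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Narrow the search to the container with the unique id first.
container = soup.find("ul", id="customer-list")

customers = []
# class_ (with a trailing underscore) filters by CSS class,
# because "class" is a reserved word in Python.
for item in container.find_all("li", class_="customer"):
    link = item.find("a")
    customers.append({
        "name": link.get_text(strip=True),
        "url": link["href"],
        "city": item.find("p", class_="city").get_text(strip=True),
    })

print(customers)
```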
  • Create a JSON file and use the json.dump() method to serialize the Python objects to JSON.
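
A small sketch of the json.dump() step, using made-up records and a temporary directory so that no file is left behind. The file name customers.json is an assumption.

```python
import json
import os
import tempfile

# Hypothetical records, shaped like the output of the scraping step
customers = [
    {"name": "Asha Rao", "city": "Pune"},
    {"name": "Liam Chen", "city": "Austin"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "customers.json")

    # json.dump() writes the Python list of dicts to the file as a JSON array.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(customers, f, indent=4)

    # Read the file back to confirm the round trip.
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == customers)  # True
```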

Below is the full implementation:
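
The following is a minimal sketch of the whole approach, not the article's original listing. It assumes a hypothetical page where each customer is an li element with class "customer" containing an a tag and a p tag with class "city". The function names, the URL in the comment, and the output file name data.json are likewise illustrative assumptions.

```python
import json

import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Send a GET request and return the raw HTML."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_customers(html):
    """Parse the HTML and return a list of customer dictionaries."""
    soup = BeautifulSoup(html, "html.parser")
    customers = []
    # Assumed structure: <li class="customer"><a href=...>name</a><p class="city">...</p></li>
    for item in soup.find_all("li", class_="customer"):
        link = item.find("a")
        city = item.find("p", class_="city")
        customers.append({
            "name": link.get_text(strip=True) if link else None,
            "url": link.get("href") if link else None,
            "city": city.get_text(strip=True) if city else None,
        })
    return customers


def save_json(data, path):
    """Write the extracted data to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)
    print("Created Json File")


def main(url, path="data.json"):
    save_json(extract_customers(fetch_html(url)), path)


# Example call (hypothetical URL, not executed here):
# main("https://example.com/customers")
```

Keeping the fetch, extract, and save steps in separate functions means the parsing logic can be checked against a saved HTML string without hitting the network.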

Output:

Created Json File

Our JSON file output:

[Figure: contents of the generated JSON file]