XML stands for Extensible Markup Language. It was designed to store and transport data. It was designed to be both human- and machine-readable. Thatβs why, the design goals of XML emphasize simplicity, generality, and usability across the Internet.
Note: For more information, refer to
XML | Basics
Here we consider that the XML file is present in the memory. Please read the comments in the code for a clear understanding.
XML File:
π python-parse-xml
Let us save the above XML file as
"test.xml". Before going further you should know that
in XML we do not have predefined tags as we have in HTML. While writing XML the
author has to define his/her own tags as well as the document structure. Now we need to parse this file and modify it using Python. We will be using
"minidom" library of
Python 3 to do the above task. This module does not come built-in with Python. To install this type the below command in the terminal.
pip install minidom
Reading XML
First we will be reading the contents of the XML file and then we will learn how to modify the XML file.
Example
Output:
#document
note
Name: Jack
Surname: Shelby
Favourite Game: Football
Messi
Ronaldo
Mbappe
In the above Python code while printing
First Name or
Last Name we have used
firstname[0] / lastname[0]. This is because there is only 1
"fname" and only 1
"lname" tag. For
multiple same tags we can proceed like below.
XML:
π python-parse-xml-2
Python
Output
Jack
John
Harry
Modifying XML
Now we have got a basic idea on how we can parse and read the contents of a XML file using Python. Now let us learn to
modify an XML file.
XML File:
π python-parse-xml-3
Let us
add the following :
- Height
- Languages known by Jack
Let us
delete the
"hobby" tag. Also let us
modify the
age to
29.
Python Code:(Modifying XML)
Output:
π python-parse-xml-4
The last 3 lines of the Python code just converts the
"file" object into XML using the
toxml() method and writes it to the
"test.xml" file. If you do not want to edit the original file and just want to print the modified XML then
replace those 3 lines by:
print(file.toxml())