Skip to Main Content

File Conversion

This guide helps users convert to / from different file formats for your project needs.

XML Conversion Using Python

What is XML?

  • XML stands for Extensible Markup Language
  • Commonly used for distributing data over the Internet
  • Used to store data in the form of hierarchical elements

Example XML file:

Download catalog.xml

In this case: <catalog> is called the root element, which is where the XML tree starts and from which child elements branch out from.

Resources

Python to XML

Python To XML (Writing to XML Files)

  • To write to an XML file, there are multiple methods we can use to create the hierarchy of elements. The .write() method writes the element tree to an XML file in the same folder as the current directory/folder you are running the python file in. 

           

  • ET.Element(root): Creates an element for the element tree and provides the root element as the parameter.
  • ET.SubElement(parent, tag, attribute): Creates a subelement with parameters of the parent element of that subelement, the tag for the subelement, and attributes of the subelement.
  • ET.ElementTree(root): Creates the element tree with the root of the element tree as the parameter.

XML to Python

XML To Python (Reading in XML Files)

1. We use Python’s built-in module xml.etree.ElementTree to parse XML data. First, we need to import the module. It’s convention to import xml.etree.ElementTree as ET. 

                      

2. First, we have to create an ElementTree object like this: 

                      

                      If we print catalogTree, we get the ElementTree object:

                      

3. Then, we have to get the root element, which in our example above is catalog. We can do this with the following line: 

                     

                     If we print the root, we get this: 

                      

  1. Now, we can search for child tags and return element objects in a list (or specific element objects) using the following methods.
    1. root.find(‘tag’): Returns the first Element object with the name given for the tag or None if the tag name is not found. 
    2. root.findall(‘tag’): Returns a list of all Element objects with the given tag name or an empty list if the name is not found

For example, if we wanted to find the first “book” tag in our XML example from above, we could use this line of code:

 

                    

                    Printing catalogElement gives us this:

                    

                    If we wanted to get a list of all the book tags, we could use this line:

                   

                  Printing catalogList gives us a list of the “book” tags in the XML file: 

                  

        5. Notice how each book element doesn’t just print by itself: it prints with a lot of other text. We can get rid of this “extra” text by iterating through each element in catalogList and using .text:

                  

                  Notice that we use a child element of the book element (title, author, genre, etc. from our XML file) inside our find() method.   

                  This outputs the titles of all the book tags (shown below).

                  

        6. We can also add each of these titles to a new list to create a list of all the book titles in our XML file:

                 

                 This gives us a list with all the titles of books in the XML file. 

                 

                 We can similarly do this with any child element of the book tag, such as the genre, author, or price of the book. 

      Full Code: