Generating XML

You should generate XML programmatically whenever possible to help ensure well-formedness. In Python, the xmlprinter module lets us do this. You can get xmlprinter.py from SourceForge. The simplest way to use it is to unpack the archive and copy xmlprinter.py into the directory that holds your Python script.
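To see why hand-assembled strings are risky, here is a minimal sketch that does not use xmlprinter at all. It relies only on the standard library (`xml.sax.saxutils.escape` and `xml.etree.ElementTree`) and assumes Python 3. The sample text and the `is_well_formed` helper are made up for illustration. Escaping is the kind of work a generator library is expected to take over.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "Fish & chips < 5"  # hypothetical content containing markup characters

# Naive concatenation leaves the raw & and < in place.
by_hand = "<greeting>" + text + "</greeting>"

# Escaping the text first turns & into &amp; and < into &lt;.
escaped = "<greeting>" + escape(text) + "</greeting>"


def is_well_formed(xml_text):
    """Return True if the standard library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(by_hand))
print(is_well_formed(escaped))
```

The first string is rejected by the parser because of the bare ampersand, while the escaped one parses cleanly.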

xmlprinter is very easy to use. An example:


import xmlprinter

output = open("output.xml","w")
writer = xmlprinter.xmlprinter(output)

writer.startDocument()
writer.startElement("greeting", 
                  { "class": "simple" })
writer.data("Hello, world!")
writer.endElement("greeting")
writer.endDocument()
output.close()

Let's go through the example line by line:

import xmlprinter
This just lets Python know we want to use the xmlprinter module.
output = open("output.xml","w")
writer = xmlprinter.xmlprinter(output)
This creates a new xmlprinter object with output directed to the filehandle output. To write to standard output, pass sys.stdout instead (after importing sys). To write to a string, we can use the StringIO module:
import xmlprinter
import StringIO
output = StringIO.StringIO()
writer = xmlprinter.xmlprinter(output)

# ... do some stuff with writer ...

print output.getvalue()
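On Python 3 the StringIO module no longer exists, and in-memory text buffers live in `io.StringIO`. The sketch below shows the same capture-to-a-string pattern there. Because xmlprinter may not be available on a modern interpreter, it assumes the standard library's `xml.sax.saxutils.XMLGenerator` as a stand-in writer. Its method names differ slightly (`characters` rather than `data`), so treat this as an analogy and not as xmlprinter's own API.

```python
import io
from xml.sax.saxutils import XMLGenerator

# Stand-in for xmlprinter: the standard library's SAX-style XML writer.
buffer = io.StringIO()
writer = XMLGenerator(buffer, encoding="UTF-8")

writer.startDocument()
writer.startElement("greeting", {"class": "simple"})
writer.characters("Hello, world!")
writer.endElement("greeting")
writer.endDocument()

# Everything written so far is available as one string.
xml_text = buffer.getvalue()
print(xml_text)
```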
Going back to our example, we next insert an XML declaration:
writer.startDocument()
The declaration that writer.startDocument() emits tells whoever reads the output that it is XML and that it is encoded as UTF-8. xmlprinter only emits UTF-8 right now, but you could always use ICU to change the encoding afterwards. We'll talk more about character sets and encodings later. The output from this is:
<?xml version="1.0" encoding="UTF-8"?>
Then we open a <greeting> tag with the attribute class = "simple", write the characters Hello, world! and finally close the <greeting> tag.
writer.startElement("greeting", 
                  { "class": "simple" })
writer.data("Hello, world!")
writer.endElement("greeting")
This produces
<greeting class="simple">Hello, world!</greeting>

Then we clean up. First we tell writer we're done writing, so that it can make sure we didn't leave any elements open. Then we close our output filehandle.

writer.endDocument()
output.close()
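The check for elements left open can be pictured as a stack of element names. The sketch below is a hypothetical `TinyXMLWriter` written for illustration under that assumption; it is not xmlprinter's implementation, and its method names are invented. It assumes Python 3 and uses only the standard library.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class TinyXMLWriter:
    """Hypothetical writer that tracks open elements on a stack."""

    def __init__(self, out):
        self.out = out
        self.open_elements = []  # names of elements not yet closed

    def start_element(self, name, attrs=None):
        attr_text = "".join(
            " %s=%s" % (key, quoteattr(value))
            for key, value in (attrs or {}).items()
        )
        self.out.write("<%s%s>" % (name, attr_text))
        self.open_elements.append(name)

    def data(self, text):
        # Escape &, < and > so the text cannot break the markup.
        self.out.write(escape(text))

    def end_element(self, name):
        # Only the most recently opened element may be closed.
        if not self.open_elements or self.open_elements[-1] != name:
            raise ValueError("unexpected end tag: %s" % name)
        self.open_elements.pop()
        self.out.write("</%s>" % name)

    def end_document(self):
        # Anything still on the stack was never closed.
        if self.open_elements:
            raise ValueError(
                "unclosed elements: %s" % ", ".join(self.open_elements)
            )


buffer = io.StringIO()
writer = TinyXMLWriter(buffer)
writer.start_element("greeting", {"class": "simple"})
writer.data("Hello, world!")
try:
    writer.end_document()  # <greeting> is still open at this point
except ValueError as error:
    problem = str(error)
    print(problem)
```

Forgetting the closing call is reported when the document ends, so the mistake does not pass silently into a malformed file.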