How to Pretty Print XML with lxml

by Ruslan Spivak on May 12, 2014

If you use lxml library and need to pretty print XML, here is a snippet that works even if you have existing indentation in your XML.

#!/usr/bin/env python
import sys
import StringIO

from lxml import etree

def main():
    xml_text = sys.stdin.read()
    parser = etree.XMLParser(remove_blank_text=True)
    file_obj = StringIO.StringIO(xml_text)
    tree = etree.parse(file_obj, parser)
    print(etree.tostring(tree, pretty_print=True))

if __name__ == '__main__':
    main()

Save the above code to a file ppxml and make it executable.
After that you could use it on the command line like this, for example:

$ cat << EOF | ppxml
> <root>
> <child1/>
> <child2/>
>        <child3/>
>   </root>
> EOF
<root>
  <child1/>
  <child2/>
  <child3/>
</root>
If you enjoyed this post why not subscribe via email or my RSS feed and get the latest updates immediately. You can also follow me on GitHub or Twitter.

Speak your mind

Previous post: