How to Pretty Print XML with lxml

by Ruslan Spivak on May 12, 2014

If you use lxml library and need to pretty print XML, here is a snippet that works even if you have existing indentation in your XML.

#!/usr/bin/env python
import sys
import StringIO

from lxml import etree

def main():
    xml_text = sys.stdin.read()
    parser = etree.XMLParser(remove_blank_text=True)
    file_obj = StringIO.StringIO(xml_text)
    tree = etree.parse(file_obj, parser)
    print(etree.tostring(tree, pretty_print=True))

if __name__ == '__main__':
    main()

Save the above code to a file ppxml and make it executable.
After that you could use it on the command line like this, for example:

$ cat << EOF | ppxml
> <root>
> <child1/>
> <child2/>
>        <child3/>
>   </root>
> EOF
<root>
  <child1/>
  <child2/>
  <child3/>
</root>
If you enjoyed this post why not subscribe via email or my RSS feed and get the latest updates immediately. You can also follow me on GitHub or Twitter.

{ 1 comment… read it below or add one }

Igor Kupczynski October 9, 2014 at 7:48 AM

I like this command to pretty print json:
$ python -m json.tool

E.g. with elasticsearch
$ curl localhost:9200/_search -d ‘{“query”: {“match_all”: {}}}’ | python -m json.tool

Reply

Speak your mind

Previous post: