Friday, January 20, 2006

Python XML implementation flaw -- whitespace and wholeText


I place this non-philosophical note into the blogosphere in the hope that Google will pick it up and do society some good.

If you are using the xml.dom.minidom implementation and are dealing with minidom.Text nodes, the following stupid behaviour occurs:

When you call node.replaceWholeText() to, well, replace the text contained inside the node, don't pass it an empty string. I was formatting the text content of nodes and using strip() to remove whitespace from around the edges. Unfortunately, calling strip() on a whitespace-only string returns the empty string.

The practical effect was major breakage. With no errors, warnings, or readily apparent cause, the entire XML document became corrupted. Not only corrupted, but empty. My entire XML document was FUBAR. Fortunately, this was version 0 and I wasn't saving my changes back to disk, so I didn't lose any information permanently. I did lose a full day (arrive 10am, swear constantly, find solution at 4pm) to this stupid, undocumented, and brain-bending behaviour.
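For anyone landing here from a search, here is a minimal sketch of the kind of guard that avoids the problem: strip each text node, but only call replaceWholeText() when something is left after stripping. The helper name strip_text_nodes and the sample document are made up for illustration, not taken from my real code. As far as I can tell from current CPython, passing an empty string makes minidom detach the text node from its parent and return None, which is why the loop also works on a copy of childNodes rather than the live list.

```python
from xml.dom import minidom


def strip_text_nodes(node):
    """Strip surrounding whitespace from text nodes below `node`.

    Hypothetical helper: whitespace-only text nodes are left untouched,
    so replaceWholeText() is never called with an empty string.
    """
    # Iterate over a snapshot, because replaceWholeText() may remove
    # neighbouring text nodes from the live childNodes list.
    for child in list(node.childNodes):
        if child.parentNode is None:
            # Already detached by an earlier replaceWholeText() call.
            continue
        if child.nodeType == child.TEXT_NODE:
            stripped = child.data.strip()
            if stripped:
                child.replaceWholeText(stripped)
            # else: whitespace-only, deliberately do nothing
        else:
            strip_text_nodes(child)


# Made-up sample document for illustration.
doc = minidom.parseString("<root>  <item>  hello  </item>  </root>")
strip_text_nodes(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

If you actually want the whitespace-only nodes gone, removing them explicitly with parentNode.removeChild(child) makes the intent obvious instead of relying on what replaceWholeText() does with an empty string.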

My dummy is spat.


