DOM Vs. SAX: Only One Will Survive
So while working on iPhone stuff, we used a SAX parser to handle XML. I thought this was a wacky way of parsing stuff, and I preferred DOM since it’s much easier to understand: the whole document is conveniently in tree form.
Today, I tried parsing a 39 MB XML file with Python’s minidom. Bad idea. It’s currently making my MacBook choke, and it also made my desktop PC cry for memory. Apparently SAX is 1337 when it comes to memory efficiency, since it streams through the file and hands you events as it goes, while DOM has to build the entire tree in memory first. Seriously, Python was taking more than 1 GB of memory to parse a 39 MB file! Perhaps my actual use of minidom was incorrect, but either way, I’ll probably go with SAX if the file happens to be even remotely big.
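For anyone who hasn’t seen SAX in Python, here is a minimal sketch of the streaming style using the standard library’s xml.sax module. It is not the code I actually ran: the TagCounter class, the count_tags helper, and the tiny sample document are all made up for illustration, and it assumes Python 3. The point is just that the handler only ever sees one event at a time, so nothing like a full tree gets built.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts how often each element name appears."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag as the parser streams through the input.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(source):
    """Parse a filename or file-like object and return the tag counts."""
    handler = TagCounter()
    xml.sax.parse(source, handler)
    return handler.counts


# Tiny in-memory stand-in for a big file (made-up sample data).
sample = io.BytesIO(b"<library><book/><book/><cd/></library>")
print(count_tags(sample))
```

For a real file you would pass the path instead of the BytesIO object, e.g. count_tags("big.xml"), where "big.xml" is a placeholder name. The tradeoff is that you have to track any state you care about yourself in the handler, which is exactly why it feels wackier than walking a DOM tree.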
Lesson learned!
Update: SAX totally pwned those large files. All 8 of them! And in less than a minute!