Comment Re:It's about tools, libraries (Score 5, Informative) 608

You have used many big words, and you may have your language levels incorrect, but you are clearly wrong in one respect:

Generic XML parsers are memory intensive and can't be as fast as regular expressions. That's just computer science. Deal with it.


Well, I've written my own XML parser, as well as a compiler for a simplified version of C, so I think I'm somewhat qualified to talk about this. A generalized XML parser is not memory intensive, unless you are a very bad programmer. All you need is a depth-first stack, which will only be as high as your XML tree is deep. And given that a stack of depth N can handle a tree of up to roughly X^N nodes (where X is the number of children per node), you are definitely going to run out of disk space before you run out of RAM. In other words, the memory required for parsing an XML tree is trivial.
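To make the stack argument concrete, here is a rough sketch in Python (my choice of language, not anything from the original discussion). The function name max_depth is made up for illustration, and the scanner assumes a simplified XML with no comments, CDATA sections, or '>' characters inside attribute values. The only state it keeps is the stack of open element names, so memory grows with nesting depth, not with document size.

```python
def max_depth(xml: str) -> int:
    """Scan simplified XML and return the deepest nesting level reached."""
    stack = []    # names of currently open elements; len(stack) is the depth
    deepest = 0
    i = 0
    while True:
        start = xml.find("<", i)
        if start == -1:
            break
        end = xml.find(">", start)
        if end == -1:
            raise ValueError("unterminated tag")
        tag = xml[start + 1:end]
        i = end + 1
        if tag.startswith("?") or tag.startswith("!"):
            continue  # declarations and similar markup are skipped in this sketch
        if tag.startswith("/"):
            # Closing tag: it must match the most recently opened element.
            name = tag[1:].strip()
            if not stack or stack.pop() != name:
                raise ValueError(f"mismatched closing tag: {name}")
        elif tag.endswith("/"):
            # Self-closing element: one level deeper, but nothing to push.
            deepest = max(deepest, len(stack) + 1)
        else:
            # Opening tag: push the element name (attributes are ignored).
            stack.append(tag.split()[0])
            deepest = max(deepest, len(stack))
    if stack:
        raise ValueError("unclosed elements")
    return deepest
```

Calling max_depth("<a><b><c/></b><b>text</b></a>") walks five tags but never holds more than two names on the stack at once.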

An XML parser is one of the simplest parsers imaginable. It's a sophomore task to create a state machine to process the generic LL(1) (or is it LL(0)?) XML grammar. And as you should know, a state machine for an LL(1) grammar is as fast as you can get.

Anything you do with regular expressions will be much more complicated. As I'm sure you know, regular expressions are turned into state machines before being used to process the input. And almost all regular expression state machines are much more complicated than the state machine you need for an XML parser. In an XML parser, definite boundaries exist on elements such as:
'<' and '>'


Regular expressions are not this smart. For example, looking for the substring "abc" in the longer string "abababaaabbbabcabababac" already generates a state machine that is more complicated than the one needed for an XML parser.
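For reference, here is a sketch of what the state machine for that substring search could look like, built the way the Knuth-Morris-Pratt algorithm builds it. This is an illustration only: real regex engines construct their automata differently, and the names build_dfa and find are hypothetical. It assumes a non-empty literal pattern. State k means "the last k characters read match the first k characters of the pattern".

```python
def build_dfa(pattern: str, alphabet: str) -> list[dict[str, int]]:
    """Build the matching automaton for a non-empty literal pattern."""
    m = len(pattern)
    dfa = [dict.fromkeys(alphabet, 0) for _ in range(m + 1)]
    dfa[0][pattern[0]] = 1
    fallback = 0  # state to imitate when the current character does not match
    for state in range(1, m + 1):
        for ch in alphabet:
            dfa[state][ch] = dfa[fallback][ch]   # mismatch: act like fallback
        if state < m:
            dfa[state][pattern[state]] = state + 1   # match: advance one state
            fallback = dfa[fallback][pattern[state]]
    return dfa


def find(pattern: str, text: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    alphabet = "".join(sorted(set(pattern)))
    dfa = build_dfa(pattern, alphabet)
    state = 0
    for i, ch in enumerate(text):
        # Characters that never occur in the pattern reset the machine.
        state = dfa[state].get(ch, 0)
        if state == len(pattern):
            return i - len(pattern) + 1
    return -1
```

Running find("abc", "abababaaabbbabcabababac") reads the text once, left to right, with no backtracking.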

Back to the "memory intensive" nature of XML parsers. If you parse your XML tree into a nested hashmap structure, then the memory needed will be proportional to the number of nodes in the XML tree. Maybe this is what you meant by "memory intensive". However, this is totally unnecessary. You can easily construct an XML parser that looks only for the specific elements you care about. Then you get only those elements, and you only need to allocate memory for the elements required.
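One way to sketch this "only keep what you care about" approach is with the incremental parser in Python's standard library, xml.etree.ElementTree.iterparse. This is not the hand-written parser described above, just an off-the-shelf stand-in; the function name extract_text and the sample document are made up for illustration. Each element is cleared as soon as its end tag has been seen, so only the wanted text is retained (in this simple version the emptied element shells do stay attached to their parents).

```python
import io
import xml.etree.ElementTree as ET


def extract_text(source, wanted: str) -> list[str]:
    """Stream through an XML source and collect the text of one element type."""
    found = []
    # An "end" event fires once an element's closing tag has been parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == wanted:
            found.append(elem.text or "")
        # Drop the element's text, attributes, and children right away.
        elem.clear()
    return found


doc = (b"<lib><book><title>A</title><price>1</price></book>"
       b"<book><title>B</title></book></lib>")
titles = extract_text(io.BytesIO(doc), "title")
```

Here titles ends up holding only the two title strings; the price element is parsed and thrown away.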
