No Nonsense XML Web Development with PHP

Alex Moskalyuk writes "PHP and XML seem like a marriage made in heaven. Powerful manipulation functions and core-language support in PHP5, combined with the universal extensibility of XML, make it a technology of choice for quite a few Web enthusiasts and companies out there. However, anyone inspired by PHP's ease of use can probably find a good cure for insomnia when faced with the XML specs. With all the DTDs, XML Schemas, XSLT and XPath queries, one can easily get the impression that the world is changing around them, and that sticking to hard-coded HTML with PHP statements, combined with SQL statements for data retrieval, would be within the zone of comfort." Read the rest of Alex's review.
No Nonsense XML Web Development with PHP
author Thomas Myer
pages 354
publisher SitePoint
rating 9/10
reviewer Alex Moskalyuk
ISBN 097524020X
summary XML, XSLT, XPath and DOM primer for PHP developers


Thomas Myer's No Nonsense XML Web Development with PHP is an XML primer for those who have been exposed to PHP but have yet to appreciate the elegance of PHP+XML solutions. Over 10 chapters and 2 appendices, Myer introduces the reader to different aspects of XML, their best-practice implementations in a LAMP environment (where the last P stands for PHP), and their relevance to the real world. As the real-world example, Myer guides the reader through writing a custom content management system, complete with publishing/admin interface, templating/presentation layer, search engine, RSS feeds and other commonly expected features.

The book is not an introduction to PHP. It assumes that the Web developer already knows what XML is but has never worked with it. So the first chapter covers parsing XML properly with IE and Firefox, validating an XML document, and the differences between a well-formed and a valid XML document. Overall, it provides a very good introduction for someone new to XML, and could probably be skipped by developers with XML exposure.

Chapter 2, XML in Practice, goes into the nitty-gritty details of XML, and 26 pages later the reader knows how to create an XML file to display in the browser, declare proper namespaces, attach a CSS file to an existing XML file and display the resulting XML+CSS file (look, Ma, no <html>!) in the browser. The author earns instant geek credibility by using Firefox screenshots throughout, switching to IE screenshots only when IE is discussed. At the end of the chapter the author takes us through basic XSLT.

DTDs, XSLT and writing a practical PHP app take up the next three chapters, followed by the XML manipulation chapters. JavaScript enthusiasts will probably find Chapter 6 pretty useful, as it discusses manipulating XML on the client side, working with XSLT, and creating dynamic site navigation based on the XML source. Chapter 7 is what one would expect from a book that has the words PHP and XML in the title: a discussion of the SAX, DOM and SimpleXML parsers, examples of their implementation, and the proper use cases for each of them. The SimpleXML subchapter also contains a good primer on XPath, a query language that lets the developer give the parser a query for navigating down the XML document.
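To give a flavor of what an XPath-style query does, here is a minimal sketch. It is written in Python with the standard library's ElementTree rather than PHP's SimpleXML, purely for illustration; the article list, the `status` attribute and the `live_headlines` helper are all invented for this example and do not come from the book.

```python
import xml.etree.ElementTree as ET

# Hypothetical article list, standing in for the kind of content a CMS stores.
XML_SOURCE = """
<articles>
  <article id="1" status="live"><headline>XML basics</headline></article>
  <article id="2" status="draft"><headline>XSLT in practice</headline></article>
  <article id="3" status="live"><headline>Parsing with DOM</headline></article>
</articles>
"""

def live_headlines(xml_text):
    """Return the headlines of all articles whose status attribute is 'live'."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including attribute predicates.
    path = "./article[@status='live']/headline"
    return [node.text for node in root.findall(path)]

print(live_headlines(XML_SOURCE))
```

The query string does the navigation: it selects `article` children with a matching attribute and then descends to their `headline` elements, so no hand-written loop over the tree is needed.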

Chapter 8 takes the reader through RDF and RSS and discusses the ways syndication feeds are used on the Web nowadays. Since we're building a content management system throughout all these chapters, this is the right time to add RSS headlines functionality to the site. The next chapter discusses another practical implementation of XML on the Web: XML-RPC calls between sites and proper ways of exchanging data via XML Web services. The chapter discusses SOAP, although not a whole lot, and just mentions REST as another way to implement Web services. As a practical exercise, the author takes readers on a tour of building an XML-RPC client and server and connecting the two.
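For readers who have never seen XML-RPC on the wire, the sketch below shows how a method call is marshalled into an XML request and unpacked again. It uses Python's standard `xmlrpc.client` module rather than any PHP library, and skips the network entirely; the method name `cms.getHeadlines` and the two helper functions are made up for illustration and are not taken from the book.

```python
import xmlrpc.client

def encode_call(method_name, *params):
    """Serialize a method call into an XML-RPC request document (a string)."""
    return xmlrpc.client.dumps(params, methodname=method_name)

def decode_call(request_xml):
    """Parse an XML-RPC request back into (method name, parameter tuple)."""
    params, method_name = xmlrpc.client.loads(request_xml)
    return method_name, params

# 'cms.getHeadlines' is a made-up method name used only for illustration.
request = encode_call("cms.getHeadlines", "news", 5)
print(request)               # the XML that a client would POST to the server
print(decode_call(request))  # what the server side recovers from that XML
```

A real client and server add HTTP transport on top of this, but the payload is just this small XML document with typed values inside it.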

The last chapter talks about using XML with databases. Native XML databases are discussed, but let's face it, most PHP development is done with relational databases anyway. Myer talks about exporting MySQL database contents into XML with phpMyAdmin and mysqldump. The first appendix includes a function reference for SAX, DOM and SimpleXML parsing in PHP, while the second one completes the CMS project by providing the rest of the necessary files.

I found the author's style approachable and very easy to follow. The code samples are succinct and to the point, and there are no generic discussions such as "Why PHP?" The project chosen for the practical implementation is a bit boring, but at the same time quite real-world. The screenshots are clear, and code examples are nicely highlighted. Errata are provided on the book's Web site. The code archive is available as a single-file download as well. The book site also offers a 100% money-back guarantee (less shipping and handling fees) to anyone who bought the title and didn't feel they were getting their money's worth.

However, I noticed a few drawbacks as well. With topics like XSLT and XPath broken up across several chapters and discussed in smaller chunks, it's hard to use the book as a reference later on. Appendix A, the PHP function reference for XML parsing, hardly seems like a worthy addition, since the PHP manual pages on the subject contain equivalent information with more real-life examples contributed by users.

With all that, the book is quite informative, educational and useful. The author manages to tackle quite a few difficult topics in the 260 pages available to him (the count excludes preface and appendices). Kudos to the author for writing chapters on XML without sounding boring, redundant or too academic. I would highly recommend this book to anyone interested in developing PHP-driven Web sites that provide or consume Web services, work with XML data or generate XML for others to use.


You can purchase No Nonsense XML Web Development with PHP from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
This discussion has been archived. No new comments can be posted.

  • by Orrin Bloquy ( 898571 ) on Wednesday March 15, 2006 @01:54PM (#14925659) Journal
    One thing I will concede to PHP is that you tend to be more likely to have XSLT engines installed on a PHP based system, whereas I had to cajole my sysadmin into getting the C-based transformation libraries installed and then locally install the dependent Perl libraries to use it on top of that. In the end the Perl/XSLT solution I created works, but it wasn't fun to install.
  • by markmcb ( 855750 ) on Wednesday March 15, 2006 @02:01PM (#14925719) Homepage
    I authored the site OmniNerd [omninerd.com]. When I first started writing code, I made a point of storing data either in a database or XML, translating data to XHTML with XSLT, using CSS for all style issues, and controlling everything with PHP. What I struggled with for over a year was the XML/XSLT portion of the site. I was constantly having to jump through all sorts of hoops to get things done that could easily be handled with just PHP and a database.

    This isn't intended to be me bashing XML/XSLT, but more of a warning. If you plan to use these two, ensure you fully understand them and how they will tie into your site. I've found with OmniNerd that XML/XSLT solutions are very nice for the more static or semi-static content and that using PHP to generate XHTML directly from the database is better suited for dynamic content.

    Whatever you choose to use though, good luck!
  • Re:wut (Score:3, Insightful)

    by Danse ( 1026 ) on Wednesday March 15, 2006 @02:03PM (#14925741)

    XML stands for Xtremely Media-hyped Language and PHP stands for Perl-Hater's Platform. They are both very overused and should be ignored from this point on. Oh crap. I guess I get a free downmod for going against Slashdot culture. Oh well.

    No, you should get the downmod for posting a moronic comment that contains flamebait only with no facts or even anecdotes to back it up. You rightly deserve at least a -3 for such a comment.

  • A compromise? (Score:5, Insightful)

    by MasterC ( 70492 ) <cmlburnett@gm[ ].com ['ail' in gap]> on Wednesday March 15, 2006 @02:11PM (#14925819) Homepage
    Since I started using PHP's DOM functions, I haven't written a lick of hard coded HTML except for templates that I import into DOM. I create template tags within the template as hook points so on loading the template into DOM I can cache a list of all these template hooks (and remove them so the template is back to valid HTML) and then I can inject my dynamic content directly into where the hooks are.

    Some quick advantages:
    • You don't have to worry about closing your tags, just assigning parents
    • You can modify your tree at any point in execution (such as style changes, removing sections of the page based on user input, etc.)
    • Outputting HTML or XHTML doesn't change your DOM tree
    • You can more easily write code with more separation between functionality (model) and interface (view)
    • If an error occurs then you don't have to worry about the "headers already sent" issue
    • You can easily create DOM manipulation libraries to do a lot of the tedious tasks for you (element creation, attribute population, etc.)

    So even if you don't want to get into XML, XSLT, etc., using the DOM for page generation is a much better solution than the traditional mixing of HTML into PHP files. The only exceptions I can think of are very small sites and cases where you don't have said libraries and such built up.

    When else would hard coding HTML be preferred? I'm drawing a complete blank.
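The template-hook approach described in the comment above can be sketched roughly as follows. This is a Python illustration using the standard `xml.dom.minidom` module, not the commenter's actual PHP code; the `<hook name="..."/>` placeholder tag, the template and the `render` function are all assumptions invented for the example.

```python
from xml.dom import minidom

# Hypothetical template; <hook name="..."/> is an invented placeholder tag.
TEMPLATE = (
    "<html><body>"
    "<h1><hook name='title'/></h1>"
    "<div><hook name='content'/></div>"
    "</body></html>"
)

def render(template, values):
    """Replace each <hook> element with a text node taken from `values`."""
    doc = minidom.parseString(template)
    # Copy the result to a plain list so replacing nodes cannot disturb iteration.
    for hook in list(doc.getElementsByTagName("hook")):
        text = doc.createTextNode(values.get(hook.getAttribute("name"), ""))
        hook.parentNode.replaceChild(text, hook)
    # Serialize only the root element, so no XML declaration is emitted.
    return doc.documentElement.toxml()

print(render(TEMPLATE, {"title": "Hello", "content": "5 < 6"}))
```

Because the content goes in as a text node, the serializer escapes the `<` in "5 < 6" on output, which is one concrete form of the "you don't have to worry about closing your tags" advantage listed above.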
  • by G)-(ostly ( 960826 ) on Wednesday March 15, 2006 @02:13PM (#14925844) Journal
    XML is not for "storing data". I can't believe people still find that confusing in this day and age. XML is for describing data. It's little more than a loosely built, glorified file format. It serves no more purpose to data than tabs separating "columns" in a text file do.

    XML is good for transferring data between systems. It is not good for storing data, which is what databases are for, or presenting data, which is what applications are for.
  • by fm6 ( 162816 ) on Wednesday March 15, 2006 @02:17PM (#14925880) Homepage Journal
    That's a very good analysis. I'm a strong XML/XSLT advocate, but only because I work with the kind of documents that need them: big nasty technical manuals and guides that have a lot of complicated structure, are always being updated, and have to be delivered in multiple formats. When someone challenges my XML dogma, they always point to some project they've worked on that would have been much harder if they'd had to use XML. Most of the time (not always!) they're right, usually because the particular project is a one-shot document that will see little or no revision. Of course, that just says that XML is useless to them.

    XML is a key technology, and much underused by my profession, which still relies too much on FrameMaker, Word, and (God help us!) plain old HTML. But it's not the solution to every content management problem.

  • Re:A compromise? (Score:5, Insightful)

    by Bogtha ( 906264 ) on Wednesday March 15, 2006 @02:22PM (#14925915)

    When else would hard coding HTML be preferred?

    The downside to using the DOM as you describe is that you need to generate the whole document before you start sending it. For example, imagine if Slashdot used your approach - on a page with hundreds of comments, you'd have to wait for every last comment to be added to the DOM before you even started to send the headline to the user.

  • Re:wut (Score:3, Insightful)

    by Anonymous Coward on Wednesday March 15, 2006 @02:24PM (#14925935)
    XML is absolutely not all it's hyped up to be.

    That said, as any Lisp programmer will tell you, tree-structured data is a Good Thing(TM). There's a reason why reading in input like:
    Mar 15 12:32:31 localhost dhclient: DHCPREQUEST on eth0 to 192.168.5.5 port 67
    is complicated and fragile, whereas reading in input like:
    (logentry (date (month Mar) (day 15) (time 12:32:31)) (host localhost) (sender dhclient) (message "DHCPREQUEST on eth0 to 192.168.5.5 port 67"))
    is so trivial that, well, I just typed this into DrScheme:
    (define logdata (read))
    and copy-pasted the second one into the input box, and DrScheme understood it perfectly.

    Regexps are basically a hack to deal with data, like the first log file (which is what it actually looks like on my system), where the structure has been compressed/eliminated. In a perfect world, everything would be tree-structured, and none of those hacks would be necessary.

    But wait... that's XML!
    <logentry><date><month>Mar</month> <day>15</day> <time>12:32:31</time></date> <host>localhost</host> <sender>dhclient</sender> <message>"DHCPREQUEST on eth0 to 192.168.5.5 port 67"</message></logentry>
    It's harder to read than the parenthetical version, and slightly harder to parse (especially if there are attributes inside the XML tags), but the two are basically equivalent.

    In Scheme, at least, you can build a generic XML-to-s-expression parser that will allow you to deal with any XML data that comes at you as easily as if it were parenthetical. And by generic, I mean that it can deal with any (well-formed) XML data ever. By contrast, regexps are fragile by definition. Even splitting along whitespace isn't always safe.
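The generic XML-to-s-expression conversion described above might look roughly like this. It is a sketch in Python rather than Scheme, using the standard ElementTree parser; the function names are invented, XML attributes are ignored, and text containing spaces is not quoted, so it covers only a simplified version of the log entry.

```python
import xml.etree.ElementTree as ET

def to_sexp(element):
    """Convert an element into a nested list: [tag, child, child, ...]."""
    node = [element.tag]
    if element.text and element.text.strip():
        node.append(element.text.strip())
    for child in element:
        node.append(to_sexp(child))
        # Text following a child element belongs to the parent (mixed content).
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node

def render(node):
    """Format the nested list in parenthetical form."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(render(part) for part in node) + ")"

# The log entry from the comment, minus the <message> element.
LOG_XML = (
    "<logentry><date><month>Mar</month> <day>15</day> "
    "<time>12:32:31</time></date> <host>localhost</host> "
    "<sender>dhclient</sender></logentry>"
)

print(render(to_sexp(ET.fromstring(LOG_XML))))
```

The same two functions work on any well-formed document, which is the point being made: the structure is carried by the data itself, not by a pattern written for one particular line format.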

    As far as PHP goes, I couldn't care less... it's both slower and less flexible than Scheme. What a combo! (Of course, Perl is too... ;)
  • by markmcb ( 855750 ) on Wednesday March 15, 2006 @02:29PM (#14925980) Homepage
    XML is not for "storing data".

    Well, in the classroom you may be correct, but when you're looking for solutions, XML is oftentimes a better place to store static data than a database. A perfect example is on OmniNerd: when one of our articles gets Slashdotted, or we think it's going to be, we bypass the database and create a static copy of our article in XML. It's faster since no "thought" is required to query specific data as it's all just there. The result has been that our server doesn't flinch when the massive wave of HTTP requests hits our site.

    I also use it to store data for parts of the site that remain static. Why insert my FAQ into my database if it's not structured in a dynamic manner? It's far easier for me to go edit an XML file than run a bunch of queries, and we already mentioned the removed burden from the database.

    Consider the alternative of storing it in an XHTML file. If I change the style of my site, then I have to update the XHTML file too as it's static. I can quickly translate the XML via XSLT with PHP, ASP, etc. There's no need to touch the data when I make a structural change. So given the static nature not requiring a database, the desire for easy updates, and the need to remove data from structure, I still choose XML.

    So, yes, from a purist perspective it's for describing data. But from the perspective of someone trying to run a functional and effective site, it can be useful for storing certain data as well.
  • by G)-(ostly ( 960826 ) on Wednesday March 15, 2006 @02:48PM (#14926161) Journal
    You're not storing the data "in XML", you're storing it on the filesystem in files that describe the data via XML. The performance benefit of the static data over the RDBMS data store is provided by the filesystem, not as a function of XML. To the contrary, your retrieval of the data is actually hindered by the XML because it increases the size of the files that must be retrieved and transferred.
  • Re:A compromise? (Score:1, Insightful)

    by Anonymous Coward on Wednesday March 15, 2006 @02:54PM (#14926225)
    You can modify your tree at any point in execution (such as style changes, removing sections of the page based on user input, etc.)

    This is not good from a programming point of view. OK, the presentation and business logic are kind of separated, but you seem to be encouraging the business logic to modify the template... This is very bad design.

    What if you need to output a non-XML format like CSV or PDF? XSLT is dead slow and shouldn't be used for live transformation (IMHO)
  • by KZigurs ( 638781 ) on Wednesday March 15, 2006 @03:42PM (#14926639)
    Recommended headlines for the next PHP fanboy posts:
    - PHP saves granny from death
    - How to build a nuclear reactor using PHP and components from RadioShack
    - Reliable extraterrestrial exploration using php.net function reference comments
    - PHP programmer cured of cancer, AIDS and herpes (acquired while trying to understand any basic computer science topic)...
    - PHP Saves! Better than Jesus!
    - PHP - a quick guide to shopping.

    C'mon nerds - trying to manipulate XML with common PHP functions is like trying to hang a picture on your wall using McDonald's fries, an average-sized elephant and a twenty-year-old issue of Playboy magazine. OK, I have no problem using PHP for what it's intended for - quick, dirty and unmaintainable HTML generators occasionally attempting to simulate the functionality of even the most basic OO languages, but please - everything has its limits.

    P.S. I occasionally do XML for living. And XSLs are simple. :P
  • Joy and Sorrow (Score:5, Insightful)

    by Tom ( 822 ) on Wednesday March 15, 2006 @07:27PM (#14928740) Homepage Journal
    I can't imagine two languages less suited for mixing than PHP and XML.

    PHP is loosely typed, full of hacks (excellent hacks that make coding easier) and is great exactly because it allows the coder to be pretty careless and have the language look out for him as far as possible.

    XML, on the other hand, is strict and harsh on the coder. Forgot to close a tag? Wrong character somewhere? Not got the tag order correct? Sorry, your entire tree fails parsing.

    They just don't mix well, and it shows everywhere. I'm currently coding a PHP app using XML-RPC, and gosh is it convoluted. You've gotta cast practically everything into the special XML-RPC values and back out again. You'd expect the libraries to have functions doing that for you, but you'd be mistaken. On the average line stuffing together an XML-RPC call, the whole "new XML_RPC_VALUE" stuff takes up twice the space of the actual variables.

    Doesn't mix well. Sorry, I like PHP a lot and XML is an excellent thing. But they just don't mix well.
