The Internet

XML and Transcoding - How Would You Do It?

morzel asks a doozy: "XML is one of these words everybody's talking about yet no-one really knows how to use it in specific applications or server technologies. At the Apache XML Project, some work is being done on integrating XML/XSL in the server itself, but personally I like IBM's idea of a transcoder in between a range of (XML) servers and a range of clients. But... how can it be done?"

"Suppose you have to develop an on-line application, and you want to go with XML on the server side and everyday browsers on the client side. Portable platforms like Palm and WAP-enabled phones will probably be frequently used client platforms.
What tools -open source or commercial- are available to accomplish this?

The elements of the system are:

  • XML Enabled Database system: Data is retrieved by the transcoder using HTTP or your favorite protocol
  • Transcoding gateway: should translate the XML data using XSL (or another way) to a form readable by the client. The exact translation or the XSL to use can be set by the server (included in the XML source), or be detected by the gateway.
  • Browsers of all colours and kinds.
A typical usage of this system would be publishing an on-line application without having to bother with client troubles except for writing the XSLs. I do web development, and the amount of work that goes into making sure every platform works as it's supposed to is way too much in comparison to the functionality of the system. Especially when exotic clients like PDAs and WAP mobile phones are requested client platforms (e.g. a sales follow-up app), the burden of getting everything working and having a UI that does the job is a real nightmare...

XML is the wave of the future, that's for sure... But what tools are available to actually incorporate XML in a system that can do all things we poor webdesigners dream of?

All suggestions welcome!"
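
The gateway's choice of stylesheet, as described in the list of system elements, could be sketched roughly as follows. This is only an illustration in Python: the User-Agent substrings, the stylesheet file names, and the `pick_stylesheet` function are all assumptions made up for the example, not part of any product mentioned in this discussion.

```python
# Hypothetical mapping from User-Agent substrings to stylesheet files.
# None of these names come from the original post; they are placeholders.
STYLESHEET_RULES = [
    ("wap", "wml.xsl"),        # WAP phones
    ("palm", "palm.xsl"),      # Palm handhelds
    ("mozilla", "html.xsl"),   # desktop browsers
]
DEFAULT_STYLESHEET = "html.xsl"


def pick_stylesheet(user_agent, declared=None):
    """Return the stylesheet the gateway should apply.

    `declared` stands for a stylesheet named by the server in the XML
    source; when present it wins, otherwise the gateway detects the
    client from its User-Agent string.
    """
    if declared:
        return declared
    agent = user_agent.lower()
    for needle, stylesheet in STYLESHEET_RULES:
        if needle in agent:
            return stylesheet
    return DEFAULT_STYLESHEET
```

The rules are checked in order, so the more specific handheld entries are placed before the generic desktop one. Applying the chosen stylesheet would be the job of whatever XSL processor the gateway uses.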

This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    XML is a technology that was created by European socialists, thus it is pro-open source and supportive of the communist GPL. IMHO this represents a great leap forward in our evolution as a social entity. The creators of XML need to hold to their principles and GPL XML. We need to go to the buildings where they work and repeat our glorious mantra to make the world good. We need to (as was put so nicely in a Microsoft Internet Explorer ad) join hands and sing songs about rainbows and free software.
  • by pb ( 1020 ) on Sunday January 16, 2000 @08:07PM (#1365699)
    Lisp has been doing this stuff forever. Maybe it'd be a good idea to look into the formats that expert systems [umbc.edu] use to exchange data; I bet they're pretty generic.

    Of course, that won't happen, we'll all make our own stripped-down, human-readable versions, with big gaping flaws, until someone either standardizes it, or hides something nasty and binary with a GUI and dominates the market (*hint* I wonder who wants to use XML and "open standards"....) So let's try to come up with a real open format now, instead. :)
    pb Reply or e-mail; don't vaguely moderate.
  • by ChipX86 ( 102440 ) on Sunday January 16, 2000 @08:14PM (#1365700) Homepage
    Well, this is kind of a shameless plug, but I'm developing an XML parser at http://mino.portaldesign.net [portaldesign.net]. It is LGPL. The library can be used in any programs and the parser that comes with it can be used for converting XML files to HTML on-the-fly.

    I'm working on XSL support (so people can easily say what XML tags should become in HTML), so that should be done in the (hopefully) near future. For now, feel free to download the latest alpha and play with it.

    In the near future, I plan to have support for databases, CSS, XSL (as mentioned above), and a few other XML-related technologies.

    People familiar with C/C++ should easily be able to write custom modules for converting from XML to HTML using the library by looking at the examples in xmlhandlers/. Anyone want to help develop this?
  • Probably one of the few truly great ideas in the Web development industry. It means freedom from client peculiarities --forget about writing for all those different browsers again and again; just one huge translator template will do (e.g. XML->Opera-compatible HTML, or IE-compatible HTML, or AvantGO, etc.). It means that potentially the same server can be serving not only PCs, laptops, PDAs and the like, but also other software, by reading plain XML, or some subset of it.

    In the OSS arena, the best example of XML on the server=>HTML (or for that matter anything else) on the client is Cocoon [apache.org]. I played around with Cocoon 1.x a little bit and it's very impressive architecturally, but even the principals agree that the performance isn't there yet. I am eagerly awaiting Cocoon 2 though ;-)...

    engineers never lie; we just approximate the truth.
  • One thing that I heard the wonderful-world of XML was supposed to allow was data on demand. A user clicks an XML/XSL defined element such as a button or piece of hypertext and the page updates without reloading.

    This was the theory anyway...has anybody heard of such an implementation, or does anybody know if it is in a future spec?

    One application (which is badly needed on the web, I think) is a dynamic collapsible tree. Imagine if you will a Slashdot comments page (not too hard, as you are looking at one!). Now, instead of getting a page-full of comments that take a healthy amount of time downloading (depending on your threshold settings): imagine clicking on a message to expand more comments in the thread which are fetched dynamically. You could re-sort, change moderation thresholds, and lots of other nifty dynamic operations without having the server do all the work.

  • by Gleef ( 86 ) on Sunday January 16, 2000 @08:34PM (#1365706) Homepage
    Ideally, browsers should develop to the point where they understand XML as well as HTML and XSL as well as CSS. There has been significant effort to do this in the Mozilla browser, the XML/CSS combo works quite well, and the person developing an XSLT (XSL Transformations) engine for Mozilla is talking about having something useful around May. Similarly, Internet Explorer 5.0 has a base understanding of XML (styled with CSS), and surely plugins for decent XML/XSL encoding for IE are likely to appear soon after Netscape shows that it's a feature people demand.

    In the meantime, there are some Java Servlets out there to do the transformation on the server side. The server will grab the XML and XSL file, do transformations, and output HTML (or whatever format) to the client. I haven't played with them enough to recommend one as being particularly better, but there's some handy stuff out there.

  • The reason we use XML in our multi-tier solution is simple. ADO cannot support detached, hierarchical record sets.

    In our case, this meant we had to find a way to store that hierarchical information, which is vital to the front end, in an intermediate format that did not put load on the database itself.

    The reason for that, of course, is that when you're running a distributed application to potentially thousands of clients, you want any database hit to be as few, fast and clean as possible.
    That means we can't sustain connections to the DB.
    That means we have to use disconnected record sets.
    Disconnected recordsets don't hold hierarchy information, and that means that we have to find some other way of hitting the database once, getting enough data to build the hierarchy externally, then shutting down the DB link.

    XML provides the functionality we need to parse a flat recordset back up to a hierarchical structure, without hitting the database again. It also has the added bonus that when it comes to presenting the front end in a browser, we can feed it directly to the browser if it's "XML compliant" (IE5, though there is a patch for IE4 [microsoft.com]).
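
    As a rough illustration of rebuilding a hierarchy from a flat result set, here is a minimal Python sketch. It does not use ADO; the row layout `(id, parent_id, name)`, the sample data, and the `tree`/`node` element names are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat result set: (id, parent_id, name); parent_id None = root.
rows = [
    (1, None, "Accounts"),
    (2, 1, "Europe"),
    (3, 1, "Asia"),
    (4, 2, "Belgium"),
]


def rows_to_xml(rows):
    """Build a nested <node> tree from flat (id, parent_id, name) rows."""
    root = ET.Element("tree")
    elements = {}
    # First pass: create one element per row.
    for row_id, _parent_id, name in rows:
        elements[row_id] = ET.Element("node", id=str(row_id), name=name)
    # Second pass: attach each element to its parent (or the root).
    for row_id, parent_id, _name in rows:
        parent = root if parent_id is None else elements[parent_id]
        parent.append(elements[row_id])
    return root


print(ET.tostring(rows_to_xml(rows), encoding="unicode"))
```

    The two passes mean the rows can arrive in any order, so a single query is enough and the database link can be closed before the tree is assembled.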


    PS: You'll also find that XSL can do similar things to your XML as CSS does to HTML ;)
  • XML is a killer technology in business to business / server to server communications. Example? I'm involved in a project designing a website selling widgets and widget service plans. (The product name has been changed to protect the innocent.) We use XML to:
    • communicate with the order fulfilment organization to check on inventory
    • gather shipping information from the fulfilment organization
    • submit order requests to the fulfilment organization

    The widget order fulfilment organization has a server that speaks XML over HTTP. We created a widget on our server to talk XML over HTTP to it. Instead of spending weeks to work out how to communicate with some proprietary server in a proprietary format we spent a few days interfacing our servers.
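
    A server-to-server exchange of this kind might look something like the Python sketch below. The message vocabulary (`inventoryRequest`, `inventoryResponse`, `sku`, `inStock`) and both function names are invented for illustration; the actual format used in the project is not described here.

```python
import xml.etree.ElementTree as ET


def build_inventory_request(sku):
    """Serialize an inventory check as an XML request body."""
    request = ET.Element("inventoryRequest")
    ET.SubElement(request, "sku").text = sku
    return ET.tostring(request, encoding="unicode")


def parse_inventory_response(body):
    """Extract (sku, quantity in stock) from an XML response body."""
    response = ET.fromstring(body)
    return response.findtext("sku"), int(response.findtext("inStock"))


# In a real exchange the request body would be POSTed over HTTP, for
# example with urllib.request; the response here is a canned sample.
sample_response = (
    "<inventoryResponse><sku>W-100</sku><inStock>42</inStock>"
    "</inventoryResponse>"
)
print(build_inventory_request("W-100"))
print(parse_inventory_response(sample_response))
```

    Because both sides only need an XML parser and an HTTP client, neither has to know anything about the other's internal systems.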

    XML = server to server / business to business killer technology

    The consumer may someday directly use XML but I don't see that coming soon on a broad scale. HTML (with Java, Javascript, CSS, etc.) will (IMHO) be the way consumers work the web for the near future.

    Of course, I could be wrong.

  • by jkorty ( 86242 ) on Sunday January 16, 2000 @08:43PM (#1365709) Homepage
    The XML FAQ is here [www.ucc.ie].
  • by Anonymous Coward
    The XML part of IBM's transcoding scheme and the planned developments at xml.apache.org are already present in the ExterXML Server from XMLSolutions (www.xmls.com). It uses cocoon (you don't have to wait for Cocoon 2, version 1.5 is pretty fast). You can specify in your document which XSL stylesheet to use for each browser. The IBM Transcoding stuff looks interesting for HTML, but for XML the transcoding solution from IBM is basically XSL.
  • I would advise against anyone using XSL.

    Looking at any non-trivial XSL stylesheets, you can see what a generally bad idea it is.

    My advice would be to use a real programming language with DOM bindings.
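
    For comparison, a transformation written against DOM bindings could look roughly like this Python sketch using the standard library's `xml.dom.minidom`. The `orders`/`order`/`customer` vocabulary and the HTML output shape are assumptions made up for the example.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

# Hypothetical input vocabulary: <orders> containing <order> elements.
SOURCE = """<orders>
  <order id="17"><customer>Acme &amp; Co</customer></order>
  <order id="18"><customer>Widgets Ltd</customer></order>
</orders>"""


def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def orders_to_html(xml_text):
    """Walk the DOM and emit an HTML list, one item per <order>."""
    document = minidom.parseString(xml_text)
    items = []
    for order in document.getElementsByTagName("order"):
        customer = order.getElementsByTagName("customer")[0]
        items.append("<li>#%s: %s</li>" % (
            escape(order.getAttribute("id")), escape(text_of(customer))))
    return "<ul>%s</ul>" % "".join(items)


print(orders_to_html(SOURCE))
```

    The trade-off is visible even at this size: the code is ordinary and easy to debug, but every output format needs its own hand-written walker.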

    XML.com has a good article regarding XSL: XSL considered harmful [xml.com].

    Note that XML.com also has some pro-XSL articles listed, but they aren't nearly as persuasive.

    The bottom line is that the W3 "ordained" XSL to be part of the grand scheme of things, although the technology hasn't been developed in response to any particular problem.

  • The question is presented in a somewhat muddled manner, but if I understand correctly, it has to do with converting from XML to various formats. For the record, I don't think this is really an issue of converting from XML (which is relatively easy, given good DTDs and [for human eyes] XSL). The beauty of XML and XSL is that it's supposed to separate the *data* from the *presentation* of the data (unlike this mess we call HTML).

    So then, if you intend to use XML to store the data, and XSL to format it, the only part of the equation left is determining which stylesheet applies to which requesting client. I have no experience with XSL (I use XML for machine data, not for human eyes) -- is it possible to determine in the document which stylesheet to use? If so, it's just a matter of writing all the stylesheets.
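
    On the question of naming the stylesheet in the document itself: the W3C's "Associating Style Sheets with XML documents" recommendation defines an `xml-stylesheet` processing instruction with pseudo-attributes such as `href`, `type` and `media`. A minimal Python sketch of reading those instructions follows; the sample document, its file names, and the `declared_stylesheets` helper are assumptions for illustration only.

```python
import re
from xml.dom import minidom

# Sample document; the href and media values are made-up placeholders.
SOURCE = """<?xml version="1.0"?>
<?xml-stylesheet href="desktop.xsl" type="text/xsl" media="screen"?>
<?xml-stylesheet href="phone.xsl" type="text/xsl" media="handheld"?>
<catalog/>"""

# Pseudo-attributes look like name="value" inside the instruction data.
PSEUDO_ATTR = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def declared_stylesheets(xml_text):
    """Return a {media: href} dict from xml-stylesheet instructions."""
    document = minidom.parseString(xml_text)
    result = {}
    for node in document.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = dict(PSEUDO_ATTR.findall(node.data))
            result[attrs.get("media", "all")] = attrs.get("href")
    return result


print(declared_stylesheets(SOURCE))
```

    A transcoder could read this table and pick the entry that matches the requesting client, which leaves writing the stylesheets as the remaining work.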

    Of course, this all depends on everyone understanding XML and XSL. If people insist on using legacy clients (like non-XML compliant web-browsers *cough*Netscape*cough*), then there is a need for "transcoders" to do the XML/XSL interpretation and spit out HTML (|| HDML || whatever) that works with that client.

    P.S. If you want applications of XML, look in the b2b e-commerce world. I'll avoid the direct plug and not name the company I work for, but the whole industry is based on XML.
  • For a data transfer layer between fully automated intelligent agent systems distributed worldwide over high-end computing clusters...

    just in case you hadn't heard...
  • -- You summarise the readability advantages of XML very well. BUT XML will simply not work (easily) if you have a multiple-user, record-locking, concurrency-handling, read-WRITE database. All the best database apps I know allow you to update data which you have already read and/or insert new data. This could get VERY messy with XML unless you have an RDBMS at the backend that implements transactions and an efficient locking scheme (it would certainly have to scale to the moon). Moreover, because there could be an appreciable delay between updates, this RDBMS would need to store a lot of 'before' images, i.e. copies of the data used to create XML resultsets. This rules out all but the beefiest RDBMSes. Thus to implement read-write XML based interfaces to a database, you are going to have to splash out on a serious RDBMS. From my experience, Oracle 8 could handle the transactions (although 8i might be better with the native JVM). Would MS SQL Server be scalable (vis-à-vis record locks)? Maybe not. P.S. you can forget about MySQL.

  • Believe it or not, the open-source bug has bitten M$ !

    Look into M$'s sponsorship of the Schools Interoperability Framework (www.schoolsinterop.org) and maybe you can see how M$ plans to use XML (and its derivative) in real world application.

  • by trance9 ( 10504 )
    The key insight into XML is that it should be used only where other solutions fall apart. XML is one of those technologies that is so general, so abstract, and so powerful that you can construct a solution for ANY problem.

    The downside is that the solution will involve extra processing steps, extra stuff to be implemented, and impose on you a development model that might not always be convenient (not everything wants to be a document, or a conversion or transcoding between document formats).

    However, there are many cases where XML is the only viable solution, and in those cases you're just glad you can solve the problem at all! A typical example is when you have documents coming from multiple sources, and you publish them to multiple targets. It's easy to see what the XML solution would look like--but the problem doesn't even fit into the other ways of doing things.

    With WebMacro [webmacro.org] a common implementation strategy is to drop key XML objects into a template that is otherwise created through ordinary WebMacro HTML template gunk.

    The advantage of this approach is that you can create the bread-and-butter stuff like shopping carts, authentication, login/logout, using ordinary Java servlet code and templates. (These things are nasty when you try and force them into a document model).

    Then in the middle of your page somewhere you have your XML document, rendered using XSLT or something. You have other targets, besides your servlet, where you publish that same XML document, so the whole thing winds up being a rather pleasant mixture of two different programming paradigms.

    Again, the key insight in this strategy is that you use XML for the parts of your problem where it is the only viable solution--and you do everything else the normal way (without the extra costs imposed by XML, since you don't need the extra power).

    I worked in an SGML shop for a couple of years, and became smitten with SGML/XML. I set out to do absolutely everything I could in SGML/XML for awhile, before realizing that a traditional template tool (like WebMacro [webmacro.org]) was far more useful for typical bread and butter servlet programming.

    I still use XML a lot, but now I use it intelligently, where it's needed!
  • I am working on a project which may accomplish what most of this discusses. I am looking for people to help with specific implementation issues such as setting up autoconf, server programming etc. More info can be found at XMLTP.Org [xmltp.org]. Cheers, Gavin
  • Ideally, browsers should develop to the point where they understand XML as well as HTML and XSL as well as CSS.
    That's the ideal situation...

    The current situation however is that there is a plethora of browsers, which is growing rapidly, with big differences among them, between OSes and even between versions...
    To develop a number of websites, one can simply not assume that users will have a specific browser on a specific OS of a specific version...

    We might evolve to a pure XML/XSL/CSS browser eventually, but until then, there has to be a different solution that can serve today... You would be amazed how many people still use Netscape 3, just because they don't have the urge to upgrade...

    Java servlets are a technique, but again: it's built into the server. There are a number of servers out there, that don't have these servlets, so that another solution would come in reaallllyyyy handy.

    Okay... I'll do the stupid things first, then you shy people follow.

  • Hmm, in this case, XML is purely used as a transfer agent, not to hold concurrent data or directly effect writes upon the database.
    I agree that that would be very very ugly, but then, I also don't think that a system should necessarily be trying to provide concurrency on the client side, especially if the client base is expected to be extensive.
    In this case, as you say, record locking and concurrency handling problems would all but preclude the use of anything but the most 'beefy' RDBMS's.

    In my case, perhaps I am lucky in that user interaction is not 'live', but transactional. I just present some output, and wait for the user to respond in whatever way. Once that response comes in, I have a heap of middle-tier business logic handling exactly what we should do with it.
    Record locking and such issues are dealt with at that level, rather than in the backend.

    And yes, I do believe that SQL Server could handle such a solution, coupled with MTS and perhaps using a little DCOM :P

    In any case, transactions can do nothing but help the cause :P
  • The downside is that the solution will involve extra processing steps, extra stuff to be implemented, and impose on you a development model that might not always be convenient (not everything wants to be a document, or a conversion or transcoding between document formats).
    At a certain point, the trouble of doing everything in HTML for 15 different browser versions, including Psion, Palm & WAP-enabled phones, and the sheer amount of work of doing that and keeping things updated - this is not a static website, but an on-line application - is so overwhelming that unless the application is extraordinarily complex, more manpower is being put into getting the UI right in every supported browser than into the application itself. This energy should be going into the application and its functionality.

    There is indeed a lot of work to be done on a framework from which one can develop these applications, but its components should be recycled very easily, especially when your company's main business is developing on-line applications ;-)

    Okay... I'll do the stupid things first, then you shy people follow.

  • AFAIK that's a combination of XML + CSS + DOM + Javascript. Basically you use Javascript to access the DOM (document object model) version of the XML parse tree and either rewrite it, or change its CSS representation on the fly.
  • by __donald_ball__ ( 136977 ) on Sunday January 16, 2000 @10:19PM (#1365727) Homepage
    Hiya. I'm one of the authors on the cocoon project and I admit my biases upfront. I think, and many of you seem to agree, that the web publishing industry (more generally, the electronic information publishing industry) is in desperate need of a standard way of separating (and mixing) content and design. XML (a generic tree description language) and XSLT (a generic tree merging and transformation language) offer a very elegant way of accomplishing that goal. The cocoon project is currently focused mainly on two goals: creating (and implementing) a standard way to create XML fragments dynamically, and determining (and implementing) the best way to maintain a site back-ended by XML and XSLT. I encourage brave developers to come check it out - the basic stuff (XML+XSLT -> HTML) works very well, the more elaborate stuff (SQL,LDAP,POP3 -> XML+XSLT -> HTML) is coming along very well, and we're playing with a very interesting take on the whole *SP paradigm called XSP - I was personally highly skeptical at first but am beginning to see the light.

    As far as IBM's product goes - once you drill down into the technical details, it looks very much like cocoon. Interestingly enough, some of the closed source components that IBM's product relies on were donated a few months back to jump start the xml.apache.org site (namely, the XML4J parser and the Lotus XSLT processor). The main thing that IBM seems to be offering here is its 'transcoder' technology - which may be interesting and certainly bears investigation, but for my money, you're better off checking out (and having a voice in the development of) the open source apache projects.
  • Hey all,

    Anyone who says "XML is one of these words everybody's talking about yet no-one really knows how to use it in specific applications or server technologies." has probably not noticed the whirlwind of activity (including many bona-fide commercial ventures) surrounding XML.

    Hundreds of sites today buy syndicated news from central sources (iSyndicate.com and newsreal are two that come to mind) and receive their news feeds via XML. Also, check out webmethods.com -- here's a phenomenally successful company whose entire business model is based on XML-enabling businesses.
  • xml rocks. every piece of online information should be in xml. usability on the web is horrible right now. the fact that search engines and yahoo-style directories are the main entrances to the web is horrific. the fact that google can't find me a single page on gkrellm (a kick-ass system monitor for linux) pisses me off to no end when i'm bored with my current skin. with everything in xml the extraction of data would be much simpler and therefore the interfaces to the web would be much more effective.

    the current problem is that

    1. lots of people know what xml is, but don't really know what to do with it.
    2. the processing of xml data at this point is very intense. rendering an xml web page (or add in the scaling of images, too, and call it transcoding as ibm does) takes a lot of work on the server side and there's not currently a way for it to be rendered on the client-side (browsers don't support this yet).

    i'm working on a solution and need help...so it's actually pretty smooth that this article came out in ./ at this point.

    in a huge blow to problems #1 and #2 above (as well as quite a few others), i am initiating the creation of Uberbia, the most open source of web sites. the backend is zope, which is a tres cool open source web application environment which can conveniently output its internal data as xml. what this allows is for information to be created in zope and stored in zope's native db format and served up as web pages (for instance) quickly, but then also output as xml. problem #2 solved. and when browsers can handle the xml...shove it out that way.

    zope also allows for information to be very easily created and shared. this is one of the main goals of Uberbia.

    the idea for Uberbia was born out of the fact that the Open Source community has been living in an environment of relatively closed content management on the internet. Sure, one could create a web page and post a HOWTO they just wrote. And then post a message to a relevant mailing list letting everyone know that resource is available. And then submit the HOWTO to the LDP and wait for it to be approved and posted on the LDP page. Uberbia will remove a lot of this hassle and allow the Open Source community to easily create and manage its content. and the data will go into an xml-aware application. problem #1 solved, at least for the Open Source community. well, okay...so i'm still workin' on it, but it'll get solved, dammit.

    on trying to figure out what i was talking about, Ethan (a friend and to-be-developer of Uberbia) wrote:

    sounds to me like you want to build an open-content information space. am I totally off-base? Bring "source" up to the next level of abstraction? Collaborative environments of information?

    yup. he gets it. but the possibilities that arise from having such a body of contributors and open content in xml are insane. for example, imagine turning on a "newbie" feature in Uberbia that automagically inserted links to the proper entry in the jargon file for every word that was defined there. not difficult with zope and the data in xml

    so, essentially i'm responding to this ask slashdot question by calling out for help with an open source project that wants to solve this problem and others. some work has been done, but there's a lot more to do. sourceforge is graciously both hosting the development of this and hosting the project itself. if you are interested at all in the development of something like this or have some really smooth-ass ideas, let me know [mailto] or join the mailing list [sourceforge.net].

    i hope some of that made sense.

    word, Uberdog

  • Uh.. is it just me, or does everyone want things to go _slower_? In other words, even on 28kbps modem, I'd usually rather have _all_ deja query results fetched, and have some transfer wait, rather than (N results found. )..

    Maybe I'm just a gratuitous waster of bandwidth, but I'm starting to lean heavily towards writing a proxy that _automatically_ grabs all links' content on a page when I hit the main page itself.

  • It isn't too bad, either.

    If no XSL stylesheet is applied then it displays the XML document using a "TreeView" default style sheet.

    Also, because the XML parser & XSL thing is COM based you can use it in any language that supports COM - like Javascript/VBScript/ASP. I hate to be a MS lover, but unless you go to Java there isn't much that can do it better than that.

    The new XML parser that comes with Win2000 is supposed to be 5 times faster, too. See MSDN [slashdot.org].

    As far as I know there is no support in IE5 for XML+CSS. I may be wrong, there, though.

  • Do you have any performance data yet? How does it compare to LotusXSL/XML4J or XT/XP?
  • You couldn't do it with HTML, either, could you?

    Any server that uses stateful connections like that is going to have to be big & powerful.

  • by X ( 1235 ) <x@xman.org> on Sunday January 16, 2000 @11:08PM (#1365734) Homepage Journal
    I think you're not looking at the problem the right way. Typical application development breaks things up into domains. These layers usually include a persistence domain (your database), a business logic domain, an application domain, and a presentation domain.

    XML really doesn't change any of the domains EXCEPT the presentation domain. You don't need an XML enabled DB, as you NEVER want to have the outside world talking directly to your DB. XML (combined with HTTP or whatever else) is one way of presenting your application. The various transforms that you would do using XSL are just "aspects" of the same presentation. So this doesn't completely change the way you build applications, just how you do your presentation.

    I've written more than a few apps that were available both as GUI applications and web servers. Both versions shared the same code base up until the last layer.

    As far as what you need to do an XML system, I think it's a lot like an existing HTML system. With HTML, you need a database server, an app server, and a web server. The web server is normally scripting enabled so you can do handy transforms with the raw data.

    With XML, it's basically the same concept, except your "XML server" needs to be using XSL to script transforms of the XML data. What we currently don't have is a very good way of doing this. Ideally you'd actually want the CLIENT to do the transforms as the XML data is usually much terser than whatever the XSL will generate. However, nobody trusts the clients to do this, so you might as well go with the XSL engine on the server.
  • by evlist ( 138506 ) on Sunday January 16, 2000 @11:13PM (#1365735) Homepage

    But what tools are available to actually incorporate XML in a system that can do all things we poor webdesigners dream of?

    There are many tools available to build such a system.

    To mention only Open Source projects, I could suggest using Apache JSERV [apache.org] with Apache Cocoon [apache.org] as a framework, Castor [slashdot.org] or Quick [jxml.com] to bind XML data to Java objects and a OODBMS like ozone [ozone-db.org] or a RDBMS like PostgreSQL [postgresql.org].

    These are my favorites ;)

    They are very powerful and highly flexible, but the price to pay is that they are rather complex to use, that you need time to get up to speed with them, and that you lose focus on the core techniques behind them.

    To try to get a good understanding of these core techniques, I have set up some simple examples showing how one can bind XML documents into Java objects, store these objects in an OODBMS and use them in an XSLT sheet, both in standalone mode and as a servlet.

    These examples are available on our web at http://downloads.dyomedea.com/java/ [dyomedea.com] and a mailing list [egroups.com] has been created to exchange and discuss such basic tips.

    Hope this helps.

    Eric van der Vlist

  • i wouldn't say that webmethods is phenomenally successful...check out this [xent.com] item on them filing for IPO on FoRK (think of fork as slashdot for non-trolls). they're another company with shaky financials and a story. anyway point taken about xml being used fairly often now.
  • To be honest, I really don't know how it compares to LotusXSL or XT/XP. I'm mostly concerned about getting it to work for now, though I do plan on comparing its performance with other XML parsers. Once I get the XSL up and running, I can compare that a little better.
  • by anthonyclark ( 17109 ) on Sunday January 16, 2000 @11:37PM (#1365739)
    Looking at any non-trivial XSL stylesheets, you can see what a generally bad idea it is. My advice would be to use a real programming language with DOM bindings.

    I wouldn't write off XSL on the strength of that article at xml.com...

    When I first looked at XSL some months ago, I thought that it would be a messy and difficult language. I was wrong. XSL, IMHO, is the right solution for translating XML into pretty much anything. Yes, it does have a steep initial learning curve (much like our favourite OS :-) but once that is out of the way, you understand why the language is so useful. Why does it look so unwieldy? Because it's a "dialect" of XML. (Which I think is a good thing - it shows how flexible XML is) Typical XSL is as simple as saying "if you encounter this XML element, do this with it." Editing XSL text is really quite easy with the correct syntax highlighting. (TextPad [textpad.com] is a good editor under windows)

    As for non-trivial XSL stylesheets? On our project, we have written XSL to transform our XML data into binary outputs. The stylesheets used ran into tens of thousands of lines! I think that qualifies for non-trivial in anyone's book. I admit that the XSL is difficult to read, but show me any source that is easy to read when >10k lines...

    XSL as a complete solution? No. Even in a relatively simple XML to HTML documentation tool I wrote, I called the XSL from a JavaScript app that handled things like file access and other helper functions. This was under Win2k, using the built in script engine to call the XSL via COM. (yes, even MS gets things right sometimes) The point is that XSL is better for transforming XML than trying to use a DOM-manipulating language binding...

    On another note, why does everyone assume that XML is solely for exchanging data on the web/net? I've used it for documentation, log files, test cases, application persistence and application exchange formats. It's a lot more useful and flexible than people think.
  • by rjb ( 137100 ) on Sunday January 16, 2000 @11:39PM (#1365740) Homepage

    You might like to check out this page [xmlscript.org]. One of the things they have is an interpreter (X-Tract) that reads a template (written in XML!) and performs pretty much arbitrary transformations on XML input data based on this template. Looks pretty cool and simple to use. X-Tract is free for download. Funny I didn't find any info on license terms though.

    I tried doing some very simple stuff with the Linux version, and the only complaints I have are:

    • fetching the input data via HTTP doesn't seem to work (as it should according to the docs)
    • when I tried calling it from a CGI it freaked out, seems that env variables override explicit XML Script commands in the template -- not what one would expect. Fixed it by clearing the environment
    • the docs, though pretty exhaustive, are not very reader-friendly (to me)
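    The "clear the environment" workaround from the second complaint can be sketched in Python. This is only an illustration under assumptions: X-Tract's real command line is not given in the post, so the child process below is a stand-in that merely lists the variables it can see, and `run_with_clean_env` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def run_with_clean_env(argv, keep=("PATH",)):
    """Run argv in a child process that only sees the variables named in `keep`."""
    clean_env = {name: os.environ[name] for name in keep if name in os.environ}
    return subprocess.run(argv, env=clean_env, capture_output=True,
                          text=True, check=True)


# Stand-in for the real interpreter (hypothetical): prints the variable names it sees.
CHILD = [sys.executable, "-c", "import os; print(sorted(os.environ))"]

if __name__ == "__main__":
    # A web server would set CGI variables such as this one before calling us.
    os.environ["QUERY_STRING"] = "file=junk.xml"
    print(run_with_clean_env(CHILD).stdout)
```

    Passing an explicit `env` mapping means the child never inherits the CGI variables set by the web server, which is the effect the workaround was after.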
  • by hqm ( 49964 ) on Sunday January 16, 2000 @11:40PM (#1365741)
    You should take a look at MetaHTML (www.metahtml.com), which is a sort of macro-like programming language designed to emit HTML (it was developed before XML was invented). It was developed by Brian Fox and myself when we had a company called Universal Access (ua.com). MetaHTML is superior in some ways to XSL, because it is more a general purpose programming language, yet its evaluator does a lot of the work of parsing XML syntax expressions. We used to use it to do many XML-ish things, such as generating the MetaHTML documentation automatically from a structured representation in the database.

    MetaHTML has also been under GNU public license since about 1996.
  • As someone mentioned earlier, XSLT can often be a real pain to work with, owing to its insistence on being "side-effect free" (so variables aren't really variable, for a start) and its declarative syntax. An alternative which still has the advantage of being written in XML is "XML Script": XML Script homepage [xmlscript.org] (Note that the version on the site is XML Script 1.0 - v1.1 will be out sometime this week, we reckon)
  • I have never been able to find a clear definition for XML. I have seen languages that claim to use XML, and I have seen pages that supposedly help you with XML, but I have never seen it defined. Which leads me to one of the following suppositions:
    1. Nobody knows what XML is, and are just trying to be cool by saying that their product uses it or that they know how to use it.
    2. XML in fact does not exist. If this were the case, the first supposition could and most likely is still true.
    This disturbs me greatly, because the XML people hyped it up to the point where this disturbs me greatly.
  • by PigleT ( 28894 ) on Monday January 17, 2000 @12:16AM (#1365746) Homepage
    Well that's unfortunate. A very quick trip straight to the Web Consortium [w3.org] shows their pages on XML [w3.org] straight up, complete with links to the XML FAQ [www.ucc.ie] and of course, just what you always wanted, the XML 1.0 Spec [w3.org]. If that's not an adequate definition, read the source for your favourite parser!
  • Could you comment on what I posted a bit above? At the expense of being redundant:
    • What are the license terms of X-Tract? Will it be open source?
    • Why doesn't work with the Linux version (substitute a valid url of course)?

    Otherwise, nice work -- keep going!

  • This may seem like a stupid question, but what does it do? Right now I can get complete site dynamics with CGI programs, so how is XML an advantage?

    This is not a troll, I seriously don't know and want to.
  • On our project, we have written XSL to transform our XML data into binary outputs. The stylesheets used ran into tens of thousands of lines!

    This is supposed to be good? Something is horribly broken. Perhaps a different tool would be more appropriate? How about a parser generator? (see Jikes [ibm.com])
  • Could you comment on what I posted a bit above? At the expense of being redundant:
    • What are the license terms of X-Tract? Will it be open source?
    • Why doesn't <_data file="http://wherever.org/junk.xml" /> work with the Linux version (substitute a valid url of course)?

    Don't know why this comment got mangled the first time, it did look right in the preview...

    Otherwise, nice work -- keep going!

  • humph ok. to tell the truth i wasn't looking too hard.
  • I work for AssureSoft whose AssureWeb website is live (work out the URL for yourself, it's not obscure but we don't want to be slashdotted). The site provides financial information to subscribers. You have to have a username and password to get the full range of services- we dole out passwords free to British independent financial advisors.

    Our first XML-based service is a quotations system which allows users to get a quote for a pension or mortgage from a wide range of companies in real time (typically 5-20 secs).

    Why we needed XML

    Our problem was that each company had a slightly different way of asking for customer details. We decided to create an XML data type definition, now adopted as industry standard by UK financial standards body Origo. This standard means that we can present pretty much the same input form, with a few optional extras, for any financial product.

    The main use of XML is in passing the input data from our web server to the companies' quotes servers.

    Layer 1: Client Browser
    Layer 2: AssureWeb server
    Layer 3: Company Quotes server

    The XML goes back and forth between layers 2 and 3. We compile standard CGI GET/POST client requests into XML on the webserver and fire them at the quotes server. The quotes server fires back a response as XML again, and we parse this and present it to the client as a standard HTML web page. There is no XML on the client side.
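    The layer 2 round trip described above can be sketched roughly as follows. Everything specific here is an assumption for illustration: the element names (`QuoteRequest`, `Quote`, `Company`, `MonthlyPremium`) are hypothetical and are not taken from the Origo standard, and the network call to the quotes server is left out.

```python
import xml.etree.ElementTree as ET
from html import escape


def build_quote_request(form_fields):
    """Turn decoded GET/POST fields into an XML request document (as a string).

    Field names must already be valid XML element names.
    """
    root = ET.Element("QuoteRequest")  # hypothetical root element
    for name, value in form_fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def render_quote_html(response_xml):
    """Parse the quotes server's XML reply and render a plain HTML table."""
    root = ET.fromstring(response_xml)
    rows = "".join(
        f"<tr><td>{escape(quote.findtext('Company', ''))}</td>"
        f"<td>{escape(quote.findtext('MonthlyPremium', ''))}</td></tr>"
        for quote in root.iter("Quote")
    )
    return f"<table>{rows}</table>"
```

    The browser only ever sees the output of `render_quote_html`, which matches the point that there is no XML on the client side.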

    Provided the company quotes server conforms to our XML standard, we can use that server for quotes. Adding new products or companies becomes a lot easier- typically we can go from scratch to beta with a new product within days. Previously it would have taken many months to write and test each individual product. XML allows us to re-use both code and input/output standards to a level never seen before.

    Our next step will be a comparative quotes service. Users will be able to enter one set of data, and fire it at multiple companies. They will then get back multiple quotations, from which they can select the best based on their criteria. Effectively we will be having multiple concurrent layer 3 transactions.


  • I've been wrestling with some internal docs at a client site. I was wondering how we could transmit the internal data using standard doc types when I bumped into the following (and learned how wrong I was, for my case in particular).

    http://www-4.ibm.com/software/developer/library/meaning.html

  • Licensing isn't mentioned on the site because it hasn't yet been fixed. It is currently not open-source, although X-Tract is and will remain free for non-commercial use. Other aspects of licensing very much depend on how we sell XML Script server apps to commercial companies, and that's yet to be decided.

    As to the _data file="http://...", that's a bug - could you please send this to support@xmlscript.org so we can get to the bottom of it?

  • I'm one of the XML Script developers, and it looks like today is the day for bug reports!

    We're not sure why the URL retrieval isn't working (it's another library that does this for us, so we'll have to look into it further) but can you send us a bug report on the CGI thing? support@xmlscript.org [mailto] is your best bet.

    We know the documentation isn't the best in the world, and we have been working on it lots for v1.1 (which you should expect to see later this week). Thanks for the feedback.

  • IE5 XSLT is very different from the W3C recommendation. It is a partial implementation of a 1998 working draft.
    Do not assume this to be a case of embrace & extend. Microsoft just implemented XSL before the spec was finalised. They say they will bring out a compliant version soon.
  • XML is useful for many things. In fact, the next version of HTML (called XHTML) will be based off XML.

    One use for XML is that you can develop entire sites using your own tag set instead of HTML. For example, if you want to represent a list of books in HTML, you would probably set up a list of items. In XML, you can do:

    <book>
      <name>Some Book</name>
      <author>Some Author</author>
    </book>

    Which is much easier to understand. Using XSL (a stylesheet language for XML) or a parser built specifically for your tag set, that <book> tag and its subtags will actually mean something.

    Writing your entire site in XML has other advantages. For example, let's say you have 100 pages on your site, all written in HTML. Now you want to change the layout of the entire site. You would have to modify the HTML of all 100 pages. If all those pages were written in XML, however, you would have to modify only one file, the XSL stylesheet.

    XML also has support for namespaces. A namespace (in XML) is a group of tags. Each namespace has a URI. For example, the upcoming XHTML 1.0 namespace is http://www.w3.org/1999/xhtml (that link does not actually exist though). Namespaces are very useful. If you were writing a document in XHTML and wanted to include tags from your own tagset, you would call in your namespace, and you would suddenly be able to use your own tags.
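    To make the namespace idea concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The books namespace URI and the helper names are made up for the example; only the XHTML URI comes from the text above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
BOOKS_NS = "http://example.org/books"  # hypothetical URI for our own tag set

DOC = f"""<html xmlns="{XHTML_NS}" xmlns:b="{BOOKS_NS}">
  <body>
    <p>My reading list</p>
    <b:book><b:name>Some Book</b:name><b:author>Some Author</b:author></b:book>
  </body>
</html>"""


def count_by_namespace(xml_text):
    """Count elements per namespace URI; ElementTree stores tags as '{uri}local'."""
    counts = {}
    for elem in ET.fromstring(xml_text).iter():
        uri = elem.tag[1:].split("}")[0] if elem.tag.startswith("{") else ""
        counts[uri] = counts.get(uri, 0) + 1
    return counts


def book_names(xml_text):
    """Select only the elements from our own tag set, ignoring the XHTML ones."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.findall(".//b:name", {"b": BOOKS_NS})]
```

    Because each element carries its namespace URI, a processor can handle the XHTML tags and the custom tags with different code without the names clashing.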

    My parser will have XSL support soon. For now, you can write the modules in C/C++ and the parser will load them automagically using the namespaces and parse the XML.

    I have a few articles/tutorials I've written over at gelicon.com [gelicon.com] on XHTML, XML, DTDs, and namespaces. Hopefully they will offer a better understanding.
  • In my experience you can do the same thing if you have a proper middle tier (application server) between the database and the client. There are roughly 1000 products that do this, and none of them need XML to do it.

    XML is just a consistent way of presenting information, not some major enabling-technology.
  • There are two parts to XSL.

    XSL Transformations:
    Transforms any XML document type into another. This can include HTML if it is well formed e.g. XHTML. In reality, it really is not just for "Stylesheets" but can also be used for data to data transformation. The W3C have published a recommendation (their version of a standard) and there are many implementations.

    XSL Formatting Objects.
    Formats XML for print or screen display. Powerful, complex typesetting-style system, you could use the analogy "PDF/Postscript for XML". Not a standard yet, and only one partial implementation of an old working draft (FOP).

    A lot of the guy's criticism in that article refers to the second part of XSL, which is not what people are using, or referring to when they discuss XSL here.
    I don't find the guy's article that persuasive; it is full of assertions, without proving them. Most of the guy's gripes are directed towards formatting objects, which is complex, but the momentum behind XSL relates to XSL Transformations.
  • I think that a company called InfoShark, and their product called XMLShark, will allow you to read data from a database, modify it and then write it back again.

    InfoShark [infoshark.com]
  • Not with a 2GB database you wouldn't want that. I've seen a quite nifty database select on the microsoft.com site that execs a query without refreshing the whole page like most sites do (and no it doesn't use frames for that). It would be even nicer if it actually gave you something useful, but that's microsoft for you.
  • Or Metamata Parse [metamata.com], which is not open source but still has some nice features.
  • By the way you're putting the problem, it seems that XSLT is the answer for your questions.

    The Apache XML project has a XSLT processor called Xalan that can take care of much of that part (I haven't tested any other XSL processors yet). Just link your XML document / DOM Tree to a style sheet and you have a transformed document to the format you like.

    The only reason I see that this is needed is because nowadays only IE 5 and Mozilla can work natively with XML files and linked Style Sheets (and that locks you to CSS for Mozilla), so if you plan to use XML with any other device, AFAIK, you will have to use some kind of transformation processor. It can be used to transform an XML doc to another XML doc, but that escapes from the presentation field.

    Just take a look at their page and make some tests. They're pretty nice tools, and quite easy to work with.

    Marcelo Vanzin
  • We don't need another data exchange format. XML is pretty adequate; for more advanced stuff we have SGML.

    What we do need are tools to manipulate XML. The tools for reading and writing XML are already there. What we need next is tools to transform XML documents (the standard to specify these transformation already exists: XSL).

    I think there are several initiatives in this direction. (sorry I don't have any references).

    Like many people I see a great future for XML but I think the coming few years will be characterized by a lot of redundant programming since everybody will individually attempt to implement more or less the same components. It would be nice to see some reusable components on the serverside.
  • I would just like to say that the learning curve of XSL is due to grasping the concepts of how to use it, rather than the language being cryptically designed.
    You create templates to match the different kinds of elements, and work your way down the tree of the document. This approach allows the stylesheet to work with documents with different numbers of elements, or slightly different structure. Some problems are solved with recursion.
    You can do a simple approach where you have a fixed structure document, and insert values from the XML at certain points. This works for a lot of problems.
    The main problem I had when learning XSL is study material. The specifications don't function as a tutorial. I recommend http://metalab.unc.edu/xml/books/bible/updates/14.html [unc.edu]. It is a version of the chapter on XSL from the XML Bible, updated for the W3C recommendation. I wish I had found it sooner (I have the book, by the way, very good).
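    The template-matching idea (one rule per kind of element, recursing down the tree) can be imitated in plain Python. To be clear, this is not XSLT and not how any XSL processor is implemented; it is a simplified sketch of the same processing model, with made-up element names, and unlike XSLT's built-in rules it drops text that no rule handles.

```python
import xml.etree.ElementTree as ET
from html import escape

TEMPLATES = {}  # maps an element name to the function that handles it


def template(tag):
    """Register a function as the rule for elements named `tag`."""
    def register(func):
        TEMPLATES[tag] = func
        return func
    return register


def apply_templates(elem):
    """Process every child with its matching rule (loosely like xsl:apply-templates)."""
    return "".join(transform(child) for child in elem)


def transform(elem):
    rule = TEMPLATES.get(elem.tag)
    # No rule for this element: emit nothing for it, but keep walking down the tree.
    return rule(elem) if rule else apply_templates(elem)


@template("library")
def library(elem):
    return f"<ul>{apply_templates(elem)}</ul>"


@template("book")
def book(elem):
    return f"<li>{apply_templates(elem)}</li>"


@template("name")
def name(elem):
    return f"<b>{escape(elem.text or '')}</b>"


@template("author")
def author(elem):
    return f" by {escape(elem.text or '')}"
```

    Since each rule only knows about its own element, the same rules cope with one book or a hundred, and with books that have no author.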
  • XML will actually allow for less network traffic. The reason for this is that you will be able to manipulate an XML DOM tree on the client side with, for instance, JavaScript. This means you won't need to contact the server for simple things like sorting a list.

    A second reason you'll have less network traffic is that you don't have to put layout information in the XML files. Rather you download a separate XSL file (which can be cached). Subsequent communication consists of data only.

    Microsoft has some nice demos on their site (yes I know it's proprietary and all but it's there) and I think Mozilla also has a few nice demos.
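    As a rough illustration of sorting on the client without another request: once the XML data has been downloaded, re-ordering it is a purely local tree operation. A browser would do this in JavaScript against its DOM; the sketch below uses Python's `xml.etree.ElementTree` instead, and the roster data and function name are invented for the example.

```python
import xml.etree.ElementTree as ET

ROSTER = """<roster>
  <player><name>Ruiz</name><average>0.281</average></player>
  <player><name>Abbott</name><average>0.312</average></player>
  <player><name>Chen</name><average>0.267</average></player>
</roster>"""


def sort_players(root, key_tag, numeric=False, reverse=False):
    """Re-order the <player> children in place; no new data is fetched."""
    def sort_key(player):
        value = player.findtext(key_tag, "")
        return float(value) if numeric else value

    root[:] = sorted(root, key=sort_key, reverse=reverse)
    return [player.findtext("name") for player in root]
```

    Calling `sort_players` again with a different `key_tag` re-sorts the same tree, which is the kind of operation that would otherwise cost a full page reload.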
  • People have discussed the database connection. I came across this article at Javaworld (via the Javalobby site). It describes a different way of using XML with databases.

    Instead of converting the entire database to an XML file, which consumes a lot of resources, and has synchronization issues, this approach places an XML API frontend on the JDBC system. This creates a "virtual" XML document that other XML tools can access via DOM or SAX.

    For example, they create a SAX frontend for JDBC, and use it with a SAX-based XSL tool (XT) to transform the data to HTML. So, for example, where the database encounters a column for CustomerName, the template for a CustomerName entity in the XSL sheet is triggered. To the XSL tool and stylesheets it seems as if they are accessing an XML document.

    http://www.javaworld.com/javaworld/jw-01-2000/jw-01-dbxml.html [javaworld.com]
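    The "virtual XML document" idea translates fairly directly to other stacks. The sketch below is not the article's Java code; it is an analogous Python version that assumes SQLite in place of JDBC and a hand-written SAX handler in place of XT. The wrapper element name `row` and the handler class are made up for the example.

```python
import sqlite3
import xml.sax
from xml.sax.xmlreader import AttributesImpl


def emit_table_as_sax(conn, table, handler):
    """Fire SAX events for each row of `table`, as if an XML file were being parsed."""
    # Identifiers cannot be bound as parameters, so `table` must come from trusted code.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    no_attrs = AttributesImpl({})
    handler.startDocument()
    handler.startElement(table, no_attrs)
    for row in cursor:
        handler.startElement("row", no_attrs)
        for column, value in zip(columns, row):
            handler.startElement(column, no_attrs)
            handler.characters("" if value is None else str(value))
            handler.endElement(column)
        handler.endElement("row")
    handler.endElement(table)
    handler.endDocument()


class CustomerNames(xml.sax.ContentHandler):
    """Collects the text of every CustomerName element it is told about."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def startElement(self, name, attrs):
        if name == "CustomerName":
            self._in_name = True
            self.names.append("")

    def characters(self, content):
        if self._in_name:
            self.names[-1] += content

    def endElement(self, name):
        if name == "CustomerName":
            self._in_name = False
```

    The handler cannot tell whether its events come from a parser or from the database cursor, and no intermediate XML file is ever written.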
  • by tgd ( 2822 ) on Monday January 17, 2000 @02:13AM (#1365769)
    A small warning for those thinking about moving down the XML/XSL route who haven't done any testing on it:

    It's slow. VERY slow.

    Most XSL implementations have significant performance and scalability issues as compared to more common custom technology for producing dynamic web pages.

    There's no argument that it's a better technology, but I've known several commercial web sites that have spent considerable resources developing XML/XSL implementations and having to roll back the technology when they discovered they needed four or five times the number of servers to be able to use it.

    Anyone know of any top-tier sites that are actually using the technology?
  • by Matts ( 1628 )
    If someone wants that they can either use NNTP or develop it using the current mod_perl + HTML route - there's no need for XML there.

    XML should be used where it's appropriate. I'm unconvinced that client-side transformations are the right thing.
  • One approach might be to treat DTDs similarly to interface definitions (as in IDL) and keep them in repositories by ORB-like intermediaries. XML documents are, after all, just instances of a particular DTD.

    This has the advantage of reusing existing (ORB) technology for new purposes, and fits into an existing ideology that many already understand.

    You would put a client of one of these XML ORBs into Apache or your browser client, and be able to exchange documents and DTDs freely just as with code objects and traditional ORBs.

    Or so I would hope. :-)
  • Here at imediation we developed an XML-based web application, and we find it very difficult to do a _good_ DB mapping. Basically you can easily find a JDBC/XML mapping to translate a table into XML, but when it comes to more sophisticated mappings we could not find anything good: for instance, expressing an entire query in "pure" XML (i.e. without SQL statements embedded in the XML doc) with joins, or expressing insert/update queries. So we developed our own, but this is not efficient, as it is not standard. Any pointers or ideas about a generic, powerful XML/SQL mapping, or an upcoming standard for that?
  • Ok, I may be oversimplifying, but it seems to me that if one requires a system that is faster than a relational database system, you could wire it so that at some set time, all your flat HTML pages are built from your database and stored in some local file system. This way you even avoid having the overhead of parsing the XML and serving to the client. All of your relations are kept natively in your database, but 99% of your users are getting the fastest possible versions by reading static HTML.

    I saw a briefing from the W3C at the last Builder.Com conference and he had some interesting things to say. Specifically he stated that the best uses he could see for XML right now are long term storage and data conversation. Without XQL being anything but vapor right now, searching and parsing it on the fly is a nightmare.

    Just seems to me that with an Oracle 8i backend database that burns out static HTML if you need speed would be simpler than trying to incorporate an XML solution unless you needed multiple systems with different architectures to work with the same data. Even in that case you could just parse the database data into XML pages instead of HTML.
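    A minimal sketch of the "burn out static HTML" approach, assuming SQLite rather than Oracle 8i and a hypothetical `articles` table with `id`, `title` and `body` columns. A scheduler such as cron would call the function at the set time.

```python
import sqlite3
import tempfile
from html import escape
from pathlib import Path


def publish_static_pages(conn, out_dir):
    """Write one flat HTML file per article row and return the file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    query = "SELECT id, title, body FROM articles ORDER BY id"
    for article_id, title, body in conn.execute(query):
        page = out / f"article-{article_id}.html"
        page.write_text(
            f"<html><head><title>{escape(title)}</title></head>"
            f"<body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>",
            encoding="utf-8",
        )
        written.append(page.name)
    return written


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE articles (id INTEGER, title TEXT, body TEXT)")
    demo.execute("INSERT INTO articles VALUES (1, 'Hello', 'First post')")
    with tempfile.TemporaryDirectory() as tmp:
        print(publish_static_pages(demo, tmp))
```

    The web server then serves the generated files directly, so neither the database nor any XML processing is on the request path.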

  • I don't mean printing the XML itself, but using XML to fit data into a template which is then printed?

    I'm thinking of something along the lines of Formscape which can format data for invoices and purchase orders and such.

  • by Matts ( 1628 ) on Monday January 17, 2000 @04:13AM (#1365776) Homepage
    Mind if I ask why you're doing this? XML parsers are off-the-shelf free commodity tools now.

    Spend your time working with those tools (XML4C, expat, rxp to name a few) to create higher level tools. Don't re-implement an XML parser - I can guarantee you it will be full of obscure bugs where you didn't understand the spec, didn't understand how to cope with character encodings, or just did something wrong. This stuff, despite the XML spec suggesting that a graduate could write a parser in a matter of weeks, is hard, and experienced people (such as James Clark) have put out excellent products for all to use under non-restrictive licences. There's even an LGPL parser already out there called libxml (ships with gnome).

    If you don't believe you'll create a broken parser, see the recent XML conformance tests on XML.com.

    I'd also love to see you move from a non-working XML parser to something supporting XSL "in the near future". I appreciate your enthusiasm, but the XPath spec has some tough little nuts to crack (I know - I'm cracking them right now) and then implementing XSLT from an 80-odd page spec - wow - good luck to you!

    (I'm not trying to poo-poo your project, but so many people start working on stuff that's already being worked on in the open-source community that it's just wasted effort).
  • I do a lot of work with these up and coming ecommerce companies, all of whom say they "do" XML. It is the most popular interface, in the products they are developing, to have what I commonly refer to as a "repository model" where systems spit data into your XML-enabled system in whatever format you want (EDI proprietary formats, regular HTTP, ASCII etc) and you play around with it in XML (a lot better for data manipulation and content management purposes) and then spit it back into whatever flavor your suppliers/customers want it in. This is what I am getting a sense for in the IBM model. This is not so much a "new" way of doing things but increasingly the standard.

    Fact is, XML is great for data interchange, plugging large amounts of standard information into standard forms (PO's, RFQs and other business docs) as well as putting some muscle into search engines via context based searching (via XML metadata) but there are way too many standards out there.

    - BizTalk - This is the standard, open nonetheless, that MSFT [microsoft.com] is developing to standardize XML. It is an open standard, but the obvious benefit to MSFT is that they can plug Biztalk functionality right into all of their product lines for interoperability across a platform.

    - OASIS's XML.org - OASIS, a non-affiliated standards body, much like W3C, set out to develop a standardized set of XML schemas and DTDs (document type definitions) however, MSFT beat them to the punch by launching their BizTalk site a day before OASIS, ahhh Microsoft, finds a way to compete even in open standards.

    - RosettaNet - These guys set out to "map" all common business processes and to make an open standard for XML in the business world, but, alas, mapping entire processes takes a long time, a lot of notoriety here, not as much substance.

    These are just a few examples, there are others, but, my guess is that you'll hear the most about these folks. To make things even more complicated, although these guys seem to be "competing" they are almost all members of each others' groups, in a sort of "coopetition" model. So, overall, it is no wonder that the big push is for standards repositories, and related translation to and from various formats.

    That's my $.02
  • Just a couple of pseudorandom notes from our own experience (we have some major apps going on line now) --

    On performance, it really matters what kind of parser you use. There are two standard parser interfaces:
    • SAX (an event driven interface) and
    • DOM, the good old document object model that is tree based.
    Both XSL (XSLT + XSL FO) and DOM look at an XML document as a tree to be manipulated appropriately, while SAX treats the document as a stream of tags to be managed by handlers. DOM is powerful, subtle, and (in many cases) slow. If you build an application around a DOM centered parser (and most are), you may have performance issues. YMMV, as always. SAX is not as powerful, you have to code more, but it is faster. More than one project in the B2B area that started with a DOM parser is looking now at SAX. There is nothing wrong with DOM based parsers and we use them a lot - but watch out for performance.
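    The difference between the two interfaces is easiest to see side by side. The sketch below uses Python's standard `xml.sax` and `xml.dom.minidom` modules on a made-up orders document; it shows the programming model only and makes no claim about the relative speed of any particular parser.

```python
import xml.sax
from xml.dom import minidom

# Made-up sample document: three orders with totals 10, 20 and 30.
DOC = "<orders>" + "".join(
    f"<order id='{i}'><total>{i * 10}</total></order>" for i in range(1, 4)
) + "</orders>"


class TotalSummer(xml.sax.ContentHandler):
    """SAX style: react to events as they stream past, keeping only a running sum."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self._buffer = None

    def startElement(self, name, attrs):
        if name == "total":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "total":
            self.total += int("".join(self._buffer))
            self._buffer = None


def sum_totals_sax(xml_text):
    handler = TotalSummer()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total


def sum_totals_dom(xml_text):
    """DOM style: build the whole tree in memory first, then query it."""
    dom = minidom.parseString(xml_text)
    return sum(int(node.firstChild.data) for node in dom.getElementsByTagName("total"))
```

    The SAX version needs more code and its own state tracking but never holds the full document, while the DOM version is shorter at the cost of building the entire tree.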

    There has been a lot of argument this year over whether or not to use XSL to style XML documents. I think the jury is still out on this -- at least as far as pure display style is concerned. (There are a lot of CSS loyalists out there as well.) But XSLT as a transformation language for XML is a real winner. One of the reasons is simple but profound -- XSLT is XML and is parseable and transformable just like any other XML document. You can create a stylesheet by using another specialized XSLT sheet to transform an XML or XSL document into the stylesheet you want. This can be very powerful, but difficult to debug.

    Finally, I am surprised that nobody on this site has mentioned the expat (stream based) parser by James Clark that is an almost standard part of the modules for Perl5. I am learning Perl using the ActiveState port on NT and am having a whale (camel?) of a time, and the expat parser is clean and fast and fun.

    Oh, and one final note -- while there are some really useful books on XML, I suggest you keep to the basic reference type (Neil Bradley's The XML Companion is next to me on my desk right now, and there is a second edition out) and use the net as your basic resource, especially lists like XML-DEV. Things are moving way too fast.

  • This may be way offtopic here, but since the topic covers XML I thought I'd ask. Is there any word processing application of the Word, WordPerfect, StarOffice genre that uses SGML to describe the text? And if not, why?

    SGML or XML would seem to be perfect for an open source word processor. One of the biggest obstacles of exchanging information in business is the many proprietary document formats. It would seem that if such a program could become the standard (I know that's a big if), it could be a potential killer app for linux in the business world. Especially if it came out on linux first. But even if it didn't, the linux version could be free whereas a windows version would most likely be proprietary. And I would place far more trust in an open source application complying with standards than I would one which is closed.

    I know word processing isn't fun or sexy, but it's an extremely important part of computing and should receive more attention than it has.

  • Um, I should have checked this before my earlier post. Apparently, AbiWord uses XML.

  • This question is what the people on the Apache XML project [apache.org] spend more or less all their time on, not just talking about it but building stuff. If you care, join up.

    Having said that, XSLT may be magic, but "old-fashioned" solutions like PHP and Zope and plain old perl-backed CGIs (perl includes an excellent XML parser) ain't going away anytime soon.

  • Exactly. Wasted effort. See xml.apache.org [apache.org] for XML parsers and other tools, some implemented in multiple languages.

  • XML doesn't solve this problem either. Writing a different stylesheet for each browser winds up being just as much work. The key is to get all of that work out of your source code, so that it is independent of the application. You can do that by using a template system.

    The IBM example has multiple sources of documents feeding multiple target formats, where those targets are diverse--not just different forms of HTML, but different media altogether. In those cases XML is a big win.
  • XML is one of these words everybody's talking about yet no-one really knows how to use it in specific applications or server technologies

    I disagree. Check out the W3C's SVG standard [w3.org]. This is for real.

    If you've ever had to muck about with all of the different proprietary flavors of vector graphics formats, you know what a great thing this will be.

    That said, I personally *don't* believe in across-the-board XML standardization panacea. Some things deserve standardization, others don't.

    Accountants all adhere to accepted standard accounting practices. This is what makes it possible to encapsulate their work into shrink-wrapped database products that pretty much any accountant can use. But this only works because the process is so well known.

    So I disagree vehemently that business-to-business transactions, for example, are ripe for XML standardization. Why? Because who the heck is such an expert on these kinds of transactions to be telling everyone else how to do it? There's a lot of trial-and-error to go through before anyone should start proposing standards.

    And remember: "You can't vote for anarchy". ;~)

  • by Anonymous Coward

    My reason for going on about multi-user, record locking databases is this :- Assume you build a good web site, nice and fast and so on, used by many people. I would suspect that as, per the old adage 'No good deed goes unpunished', your boss would then ask you to build a more interactive site.

    Then you realise to your horror that XML doesn't really help at all when it comes time to trying to re-mesh updated/changed XML 'data bursts' back in to the main DB.

    Another thing that just occurred to me - Surely the queries needed to get the hierarchical data have to be expressed in SQL. If so, surely the cost in terms of logical/physical reads (i.e. the cost to the server of doing the queries) will be the same whether you do them all at once, to build your XML 'data burst' or whether you run them just as the user requests them.

    In Oracle you can keep open connections to the server at all times (and even pre-start some at DB startup) i.e. the connection latency is very small. I think SQL Server would have to be configured to pool connections in some way. Does SQL Server 7 let you do this? Does MTS let you do this? I'm not sure.

    BTW what are your feelings on MS having to delay the In-Memory Database and COM+ (component) load balancing. As I remember they had to drop them from basic W2000 Server and have said you'll get them in the W2000 Datacenter edition. It might be that without these features your DCOM and MTS architecture might run out of steam. (You might even have to tell your boss to splash out on W2000 datacenter edition as well!).

    Just some thoughts.

  • IE 5.0 does this pretty well already using XML/XSL.

    Here's an interesting example:
    XSL Sample [microsoft.com]

    Which is from the following article:
    Choosing between XSL and CSS [microsoft.com]

    Of course, solutions like this aren't very appropriate yet for public websites, as they require IE 5.0. But the technology is very exciting.

    There are several other examples on this site that utilize client-side XML processing to dynamically change the way data is displayed - sorting a baseball roster by name or batting averages, or even calculating and displaying statistics on the client.
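
    To make the baseball example concrete, here is a minimal sketch of re-sorting and summarizing the same XML data without another trip to the server. Python stands in for the browser's XML/XSL engine, and the roster data and function names are made up for illustration; this is not the code from the Microsoft samples.

```python
import xml.etree.ElementTree as ET

# Made-up roster; the real demo data is not reproduced here.
ROSTER_XML = """
<roster>
  <player><name>Ruiz</name><avg>0.287</avg></player>
  <player><name>Adams</name><avg>0.312</avg></player>
  <player><name>Kim</name><avg>0.251</avg></player>
</roster>
"""


def sorted_names(xml_text, key):
    """Return player names ordered by 'name' (A-Z) or 'avg' (best first)."""
    players = ET.fromstring(xml_text).findall("player")
    if key == "name":
        players.sort(key=lambda p: p.findtext("name"))
    elif key == "avg":
        players.sort(key=lambda p: float(p.findtext("avg")), reverse=True)
    else:
        raise ValueError("unknown sort key: %r" % key)
    return [p.findtext("name") for p in players]


def team_average(xml_text):
    """Compute a statistic from the data already on the client."""
    avgs = [float(p.findtext("avg"))
            for p in ET.fromstring(xml_text).findall("player")]
    return sum(avgs) / len(avgs)


if __name__ == "__main__":
    print(sorted_names(ROSTER_XML, "name"))
    print(sorted_names(ROSTER_XML, "avg"))
    print(round(team_average(ROSTER_XML), 3))
```

    The point is only that the data arrives once and every view is derived from it locally.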

  • Convert the demos you saw to ordinary HTML (without losing features) and you'll see the amount of communication increase since you can't do much at all on the clientside. With HTML you always have to transfer layout information since that's all you got. You can to some extent manipulate the client side DOM model but in practice you'll let the server handle more complex things like sorting data (resulting in a reload of the entire page rather than a small XSL file).

    Anyway, I don't think that the bandwidth problem is caused by either HTML or XML. The real problem is the objects that are referenced like for instance gif or jpg images and that won't change I'm afraid.
  • Does XSL have modularity? No

    Does XSL encourage reuse through its syntax? No

    Does XSL base its constructs on proven language design ideas picked up in the last twenty years? No

    I have no idea why people are so ga-ga over a language that predates Algol-6x in its design.

  • XSL provides no features for reuse or modularity, so any useful stylesheet ends up being monstrous.

    For someone who uses a language like Python or Java, I can't imagine why they would find anything compelling about XSL. It really is a dog language. Most people are just too ga-ga over the fact that it is encoded in XML to see how lame it really is.

    Thankfully, few people are rallying behind it.

  • Isn't this what the Cocoon project does? You list the stylesheets at the top of the code and then Cocoon selects the proper one based on the client ID.
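
    A rough sketch of that idea, not Cocoon's actual code: the document lists its stylesheets in `xml-stylesheet` processing instructions, and the server picks one by matching the client's User-Agent. The `media` names, the User-Agent substrings and the file names are all assumptions made for the example.

```python
import re
from xml.dom import minidom, Node

# Hypothetical document listing one stylesheet per kind of client.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet href="page-html.xsl" type="text/xsl"?>
<?xml-stylesheet href="page-wml.xsl" type="text/xsl" media="wap"?>
<?xml-stylesheet href="page-palm.xsl" type="text/xsl" media="palm"?>
<page><title>Hello</title></page>
"""

# Assumed mapping from User-Agent substrings to media names.
AGENT_MEDIA = [("Nokia", "wap"), ("UP.Browser", "wap"), ("Palm", "palm")]

PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def media_for(user_agent):
    """Map a User-Agent string to a media name, or None for ordinary browsers."""
    for needle, media in AGENT_MEDIA:
        if needle in user_agent:
            return media
    return None


def pick_stylesheet(xml_text, user_agent):
    """Return the href of the stylesheet matching the client, else the default."""
    wanted = media_for(user_agent)
    default = None
    for node in minidom.parseString(xml_text).childNodes:
        if node.nodeType != Node.PROCESSING_INSTRUCTION_NODE:
            continue
        if node.target != "xml-stylesheet":
            continue
        attrs = dict(PSEUDO_ATTR.findall(node.data))
        media = attrs.get("media")
        if media is None:
            # A stylesheet without a media attribute acts as the fallback.
            if default is None:
                default = attrs.get("href")
        elif media == wanted:
            return attrs.get("href")
    return default


if __name__ == "__main__":
    print(pick_stylesheet(DOC, "Nokia7110/1.0"))
    print(pick_stylesheet(DOC, "Mozilla/4.0 (compatible; MSIE 5.0)"))
```

    The chosen stylesheet would then be handed to whatever XSLT processor you run.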
  • Very cool.

    If you "already do this [convert XML->HTML or XML->WAP]", how does that work? Is it custom?
  • I have written an article that will help you XML-newbies get up to speed on the idea of XML and some of the sub-specs. The Promise of XML. [targetpc.com]

    I believe eventually we are going to get to a point where server-side transcoding will not be necessary. However, this will be several years, and we are going to have to learn how to do all of this efficiently.

    I am even developing my own transcoding software process because I believe I have a better method of doing it than what is currently available. If and when I do succeed it will be closed-source because I want to make money off of my product, not just give away all my hard work.

    Anyway, the next few years are going to be very interesting.


  • XML is a metaformat, and XSL is an XML-conformant format.

    This format in particular offers no modularity or reuse features, and there is nothing about XML that strictly forbids such features.

  • Aha.. thank you.
  • While I think your description of the history of XML is quite interesting, there are some inaccuracies in there which are somewhat misleading.
    Call me pedantic, but I have some issues with the following statement:

    HTML and XML are related formats; in fact, HTML can be defined as a subset of XML.

    This is a bit of a peeve of mine. HTML is an application of SGML, not a subset of SGML, and definitely not a subset of XML.

    A lot of stuff that's in HTML is not legal in XML, like the IMG tag and the OPTION tag:

    Which is why XHTML [w3.org] was created.
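
    A quick way to see the difference is to feed both styles to any XML parser. The markup below is an illustration chosen for this sketch, not necessarily what the comment originally showed: an unclosed IMG and OPTION in the HTML habit, and the same thing written the XHTML way.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """True if an XML parser accepts the markup, False if it rejects it."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML habit: empty and optional-close tags are simply left open.
HTML_IMG = '<p><img src="logo.gif"></p>'
HTML_OPTION = '<select><option selected>One<option>Two</select>'

# XHTML habit: every element is closed and every attribute has a value.
XHTML_IMG = '<p><img src="logo.gif" /></p>'
XHTML_OPTION = ('<select><option selected="selected">One</option>'
                '<option>Two</option></select>')

if __name__ == "__main__":
    for sample in (HTML_IMG, HTML_OPTION, XHTML_IMG, XHTML_OPTION):
        print(is_well_formed(sample), sample)
```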

  • I'd like to echo your shout out to James Clark's products. On the Java front, his XT library implements XSLT, and uses a SAX parser (which, as was pointed out, implies better performance than DOM).

    http://www.jclark.com/xml/xt.html [jclark.com]
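
    For anyone who hasn't used both styles, here is a small sketch of the difference, in Python rather than Java and with a made-up orders document. The SAX version reacts to events as the parser reads and keeps only a running total; the DOM version builds the whole tree first and then walks it. How much faster SAX ends up being will depend on the document and the parser.

```python
import xml.sax
from xml.dom import minidom

# Made-up input: three orders with totals 10, 20 and 30.
ORDERS = (b"<orders>"
          b"<order><total>10</total></order>"
          b"<order><total>20</total></order>"
          b"<order><total>30</total></order>"
          b"</orders>")


class TotalHandler(xml.sax.ContentHandler):
    """Accumulates the text of every <total> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.in_total = False
        self.buffer = []
        self.total = 0

    def startElement(self, name, attrs):
        if name == "total":
            self.in_total = True
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect until the end tag.
        if self.in_total:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "total":
            self.total += int("".join(self.buffer))
            self.in_total = False


def sum_with_sax(xml_bytes):
    handler = TotalHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.total


def sum_with_dom(xml_bytes):
    doc = minidom.parseString(xml_bytes)
    return sum(int(node.firstChild.data)
               for node in doc.getElementsByTagName("total"))


if __name__ == "__main__":
    print(sum_with_sax(ORDERS), sum_with_dom(ORDERS))
```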
  • A good quote recently on the XML lists was as follows:

    XML only solves the problem of data formatting.

    There are some doc-heads out there that are trying to wrap XSL, XQL, XPath, and some of the other proto-standards into one cohesive view of the world, but it really isn't there yet.

    SQL databases are still the way to go for storage - more due to uptime and recoverability than anything else. Also, regular programming languages such as Python and Java, when used with DOM bindings are still a more powerful, efficient, and flexible solution than XSLT or XSL-FO.
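
    As a small illustration of the "regular language plus DOM bindings" route, here is a sketch that turns a made-up news feed into an HTML list with Python's bundled `xml.dom.minidom`. The element names and URLs are invented for the example.

```python
from html import escape
from xml.dom import minidom

# Made-up feed; element names are assumptions for this sketch.
NEWS_XML = """<news>
  <item><title>XML &amp; transcoding</title><url>http://example.org/1</url></item>
  <item><title>WAP phones</title><url>http://example.org/2</url></item>
</news>"""


def text_of(parent, tag):
    """Concatenate the text nodes of the first <tag> child element."""
    node = parent.getElementsByTagName(tag)[0]
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


def to_html(xml_text):
    """Render each <item> as a list entry, escaping text for HTML output."""
    doc = minidom.parseString(xml_text)
    rows = []
    for item in doc.getElementsByTagName("item"):
        title = escape(text_of(item, "title"))
        url = escape(text_of(item, "url"), quote=True)
        rows.append('<li><a href="%s">%s</a></li>' % (url, title))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(to_html(NEWS_XML))
```

    Loops, functions and the rest of the language are available for anything more involved.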

  • > A lot of stuff that's in HTML is not legal in XML, like the IMG tag and the OPTION tag:

    Sure it is, in well formed XML documents. Just don't expect it to be understood by any other XML-based-language processor.

    Transcoding is a good idea, but the hard work isn't in the transcoding infrastructure, it's in the style sheets.

    Also, there's several commercial offerings in this space that have been around for a while: Spyglass Prism; OnlineAnywhere (acquired by Yahoo); Proxynet (acquired by Puma); Argogroup Actigate.

    MB
  • I agree that too much work is being done on software that has already been developed. However, I have not yet found an XML parser that does what I've needed it to do. Most of the parsers out there are written in Java, which won't work on all servers. They also are built to handle specific types of XML only. My parser can handle as many types of XML as you want in one file using namespaces. It is also written in C++, so it should work on any *nix system.

    I'm not saying everybody has to use this program, or that it will be the #1 XML parser. I'm just saying it's something useful I'm developing, which is also helping me learn a great many things about XML development.

    Besides, it gives me something to do :)
  • My article was simplistic in how I stated that, so I will try to correct myself here.

    HTML has recently been slightly altered into the XHTML DTD.

    A person can use any XHTML DTD in any XML document.

    So saying that HTML is a subset of XML is not far from the truth. I am also willing to bet a person would have moderate success using a regular HTML DTD in an XML document, but it would not be worth it.

  • Read your history, junior: he was a preacher in New England during the reign of the Puritans.
  • The decomposition into three system elements (XML content source, transcoding gateway, and browser) makes a lot of sense. That way the content source can focus on what it does - deliver content - and the transcoding gateway can handle customizing the content for presentation on whatever device is making the request. The IBM Transcoding Technology (see http://www.ibm.com/software/secureway/transcoder/) is an example of a tool for building the transcoding gateway. You can download and try the beta code now. There are additional notes at this web site about other tools that may be useful in developing this kind of application. There is a short write-up on XSL at http://www.ibm.com/software/secureway/transcoder/xsl.html.

    As you hinted in your note, it can sometimes be a challenge to select the best stylesheet to apply to a given XML document. The gateway may want to choose a stylesheet based on the source document and the destination browser or device. In addition, different stylesheets may be better suited to specific user preferences or network connections. The IBM transcoding technology includes a way to select the "best" stylesheet to apply in a given situation.

    The Transcoding technology can also adapt content other than XML for different clients. HTML requires special processing because you can't apply stylesheets to it directly since it's not well formed. Images also require special handling to adapt them for the destination device. The whole transcoding gateway may be a separate component, installed as an HTTP proxy, or it may be configured as a servlet on the same server that is the content source.
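
    To give a feel for the stylesheet selection problem described above, here is a minimal sketch. It is not how the IBM code works; it assumes a simple "most specific matching rule wins" policy, and the rule fields (`doc_type`, `device`, `network`) and stylesheet names are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """A stylesheet plus the conditions under which it applies (None = any)."""
    stylesheet: str
    doc_type: Optional[str] = None
    device: Optional[str] = None
    network: Optional[str] = None

    def score(self, request):
        """Count matched conditions; -1 if any stated condition conflicts."""
        points = 0
        for field in ("doc_type", "device", "network"):
            wanted = getattr(self, field)
            if wanted is None:
                continue
            if request.get(field) != wanted:
                return -1
            points += 1
        return points


# Hypothetical rule table, from catch-all to most specific.
RULES = [
    Rule("generic.xsl"),
    Rule("catalog-html.xsl", doc_type="catalog"),
    Rule("catalog-wml.xsl", doc_type="catalog", device="wap-phone"),
    Rule("catalog-wml-lite.xsl", doc_type="catalog", device="wap-phone",
         network="slow"),
]


def best_stylesheet(request, rules=RULES):
    """Pick the applicable rule that matches the most request properties."""
    best = max(rules, key=lambda rule: rule.score(request))
    if best.score(request) < 0:
        raise LookupError("no stylesheet applies to %r" % (request,))
    return best.stylesheet


if __name__ == "__main__":
    print(best_stylesheet({"doc_type": "catalog", "device": "wap-phone",
                           "network": "slow"}))
    print(best_stylesheet({"doc_type": "invoice", "device": "desktop"}))
```

    User preferences could be added as one more field on the rule without changing the selection logic.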
  • The jabber project is doing a lot of stuff with XML. I'm not sure if this is similar to the XML server that IBM is doing. Maybe someone wants to contrast them for me?

  • XML really doesn't change any of the domains EXCEPT the presentation domain.

    I only partially agree. In the presentation domain, XML can be used to isolate the logical structure of the data from the HTML/WML/etc. It's very useful for this, but beware of the slowness of XSLT (as others have commented). I found that using the fastest XSLT (the jclark version [jclark.com]) it still took around 300 ms to produce about 20K of HTML from XML.

    In my situation, much of the XML was static information, so I decided to generate JSP output using XSLT instead, since JSP is compiled; the same could be done with another compiled scripting language. What was most interesting to me was the problem of isolating the static parts of the page, which could be compiled in JSP, from the dynamic parts, which had to come from the database / application layer. In this case, the tag extensions in the latest JSP (1.1) are very handy. They allow the JSP file to be a well-formed XML document, and therefore easily generated by XSLT, and the extended tags can be programmed to interact with the application layer in a very clean way. The tag extensions could be programmed to either interact with an application object, or an XML DOM, although actually the latter is more cumbersome.

    I agree that XML is not very valuable as a direct interface to the database -- there should always be a layer between the database server to enforce access control, implement rules, etc. However, XML is useful as an exchange format between loosely connected servers, such as in B2B interactions. In these cases it is better than using distributed objects, because the coupling is looser and easier to define. But I'm of the opinion that the XML should represent a high-level operation, not database rows.

  • If anyone is interested in integrating XML delivered content into their application, at Moreover.com we've just given free access to all of our headlines from 1500 sources, in a variety of flavors of xml: moreoverxml; wddx; rss and. See Moreover News Categories [moreover.com] From our own perspective, what is interesting is that some of the more sophisticated XML-based initiatives for syndication of XML content such as ICE are over complex for many applications. Some much simpler definitions such as wddx allow very speedy integration of content and metadata into a database.
  • MultiMania's site has most of its content stored in XML. The main HTTP servers are Apache+PHP; we have a JVM running the SAXON stylesheet processor, and a MySQL database with "glue" data, telling the system which XSL stylesheet to apply to which XML document to generate which HTML page. Some neat hacks and some smart caching even let us deliver 'semi-dynamic' pages - content stored as XML, interpreted as PHP on delivery.

    XML rocks. You don't need to stuff your head full of theoretical debates about namespaces, general entities, etc. All you need is vi (or Notepad) and Saxon. To learn XML syntax, just write XML files by hand and feed them to SAXON until it no longer reports XML errors. To learn XSL, just write XSL files until you get SAXON to actually spit out some HTML. Lots of examples are available to accelerate the trial and error process.

    When you are finally ready to integrate the whole shebang into actual applications, there are tons of open-source tools to choose from. Look at the list above again - Apache,PHP,MySQL,SAXON - cost zero - this combo drives one of France's most popular Websites.
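
    For the curious, the "glue" idea can be sketched in a few lines. This is a guess at the general shape, not MultiMania's code: Python's bundled `sqlite3` stands in for MySQL, `fake_transform` stands in for the call out to SAXON, and the table layout and file names are invented.

```python
import sqlite3


def make_glue_db():
    """Build an in-memory stand-in for the glue table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE glue ("
               "page TEXT PRIMARY KEY, xml_doc TEXT, stylesheet TEXT)")
    db.executemany("INSERT INTO glue VALUES (?, ?, ?)", [
        ("/news.html", "news.xml", "news-html.xsl"),
        ("/news.wml", "news.xml", "news-wml.xsl"),
    ])
    return db


def fake_transform(xml_doc, stylesheet):
    """Placeholder for the real XSLT run; returns a marker string only."""
    return "<!-- %s via %s -->" % (xml_doc, stylesheet)


class PageBuilder:
    """Looks up which stylesheet goes with which document and caches output."""

    def __init__(self, db, transform):
        self.db = db
        self.transform = transform  # callable(xml_doc, stylesheet) -> str
        self.cache = {}
        self.transform_calls = 0

    def get(self, page):
        if page in self.cache:
            return self.cache[page]
        row = self.db.execute(
            "SELECT xml_doc, stylesheet FROM glue WHERE page = ?",
            (page,)).fetchone()
        if row is None:
            raise KeyError(page)
        self.transform_calls += 1
        output = self.transform(*row)
        self.cache[page] = output
        return output


if __name__ == "__main__":
    builder = PageBuilder(make_glue_db(), fake_transform)
    print(builder.get("/news.wml"))
    print(builder.get("/news.wml"), builder.transform_calls)
```

    A real setup would also need to throw away cached pages when the XML or the stylesheet changes.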

  • by Matts ( 1628 )
    Download expat. Download the C++ bindings for it. It supports any XML format you want to write code for, has full (and correct) namespaces support, and is very much free software.

    If that doesn't bake yer noodles, download rxp which also does validation against a dtd.

    Really, work on providing XPath and XSL support for expat - the community will thank you _much_ more for it.
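
    If you want to see expat's namespace handling without writing any C++, Python ships a binding to it as `xml.parsers.expat`. The sketch below is only an illustration with made-up namespace URIs: it mixes two vocabularies in one file and lets expat report each element with its namespace URI attached.

```python
import xml.parsers.expat

# Made-up document mixing two vocabularies via namespaces.
DOC = b"""<doc xmlns:inv="urn:example:invoice" xmlns:shp="urn:example:shipping">
  <inv:total>42</inv:total>
  <shp:address>1 Main St</shp:address>
  <inv:total>8</inv:total>
</doc>"""


def count_by_namespace(xml_bytes):
    """Count elements per namespace URI ('' for elements in no namespace)."""
    counts = {}

    def start(name, attrs):
        # With namespace processing on, expat reports "URI<sep>localname".
        uri, _, _local = name.rpartition(" ")
        counts[uri] = counts.get(uri, 0) + 1

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start
    parser.Parse(xml_bytes, True)
    return counts


if __name__ == "__main__":
    print(count_by_namespace(DOC))
```

    Dispatching to a different handler per URI is then a dictionary lookup.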
  • You can connection pool with SQL server 7, btw. Also, M$ says COM+ will be available "A month or so" after Win2K
