XML and Transcoding - How Would You Do It?
"Suppose you have to develop an on-line application, and you'd want to go with XML on the server side, and everyday browsers on the client side. Portable platforms like Palm and WAP-enabled phones will probably be a client platform that is being used frequently.
What tools -open source or commercial- are available to accomplish this?
The elements of the system are:
- XML Enabled Database system: Data is retrieved by the transcoder using HTTP or your favorite protocol
- Transcoding gateway: should translate the XML data using XSL (or another way) to a form readable by the client. The exact translation or the XSL to use can be set by the server (included in the XML source), or be detected by the gateway.
- Browsers of all colours and kinds.
XML is the wave of the future, that's for sure... But what tools are available to actually incorporate XML in a system that can do all things we poor webdesigners dream of?
All suggestions welcome! "
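The gateway described in the question can be sketched in a few lines. This is only an illustration under stated assumptions: the sample document, the `pick_format` rule, and the element names are all hypothetical, and plain Python functions stand in for the XSL stylesheets a real gateway would apply.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record, standing in for data fetched from the XML store.
SOURCE = "<article><title>XML news</title><body>Transcoding works.</body></article>"


def pick_format(user_agent):
    """Guess an output format from the User-Agent header (crude, illustrative rule)."""
    ua = user_agent.lower()
    if "wap" in ua or "up.browser" in ua:
        return "wml"
    return "html"


def to_html(doc):
    """Render the article as a minimal HTML page."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = doc.findtext("title")
    ET.SubElement(body, "p").text = doc.findtext("body")
    return ET.tostring(html, encoding="unicode")


def to_wml(doc):
    """Render the article as a single WML card for phone browsers."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", title=doc.findtext("title"))
    ET.SubElement(card, "p").text = doc.findtext("body")
    return ET.tostring(wml, encoding="unicode")


# One renderer per client family; a real gateway would map to stylesheets instead.
RENDERERS = {"html": to_html, "wml": to_wml}


def transcode(xml_text, user_agent):
    """Parse the source XML once and render it for the detected client."""
    doc = ET.fromstring(xml_text)
    return RENDERERS[pick_format(user_agent)](doc)
```

Calling `transcode(SOURCE, "Mozilla/4.0")` yields the HTML page, while a User-Agent containing "WAP" yields the WML card. The point of the design is that the source document never changes; only the renderer table grows as new client types appear.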
Why XML is GOOD!!! (Score:1)
Standard formats needed... (Score:3)
Of course, that won't happen, we'll all make our own stripped-down, human-readable versions, with big gaping flaws, until someone either standardizes it, or hides something nasty and binary with a GUI and dominates the market (*hint* I wonder who wants to use XML and "open standards"....) So let's try to come up with a real open format now, instead.
---
pb Reply or e-mail; don't vaguely moderate [152.7.41.11].
Mino XML parser (Score:3)
I'm working on XSL support (so people can easily say what XML tags should become in HTML), so that should be done in the (hopefully) near future. For now, feel free to download the latest alpha and play with it.
In the near future, I plan to have support for databases, CSS, XSL (as mentioned above), and a few other XML-related technologies.
People familiar with C/C++ should easily be able to write custom modules for converting from XML to HTML using the library by looking at the examples in xmlhandlers/. Anyone want to help develop this?
XSLT is a Great Idea (Score:2)
In the OSS arena, the best example of XML on the server=>HTML (or for that matter anything else) on the client is Cocoon [apache.org]. I played around with Cocoon 1.x a little bit and it's very impressive architecturally, but even the principals agree that the performance isn't there yet. I am eagerly awaiting Cocoon 2, though.
engineers never lie; we just approximate the truth.
Related: Client-side data on demand? (Score:2)
This was the theory anyway...has anybody heard of such an implementation, or does anybody know if it is in a future spec?
One application (which is badly needed on the web, I think) is a dynamic collapsible tree. Imagine if you will a Slashdot comments page (not too hard, as you are looking at one!). Now, instead of getting a page-full of comments that take a healthy amount of time downloading (depending on your threshold settings): imagine clicking on a message to expand more comments in the thread which are fetched dynamically. You could re-sort, change moderation thresholds, and do lots of other nifty dynamic operations without having the server do all the work.
-AP
On the browser (Score:3)
In the meantime, there are some Java Servlets out there to do the transformation on the server side. The server will grab the XML and XSL file, do transformations, and output HTML (or whatever format) to the client. I haven't played with them enough to recommend one as being particularly better, but there's some handy stuff out there.
----
Uses of XML in the real world... (Score:2)
In our case, this meant we had to find a way to store that hierarchical information, which is vital to the front end, in an intermediate format that did not put load on the database itself.
The reason for that, of course, is that when you're running a distributed application to potentially thousands of clients, you want any database hit to be as few, fast and clean as possible.
That means we can't sustain connections to the DB.
That means we have to use disconnected record sets.
Disconnected recordsets don't hold hierarchy information, and that means that we have to find some other way of hitting the database once, getting enough data to build the hierarchy externally, then shutting down the DB link.
XML provides the functionality we need to parse a flat recordset back up to a hierarchical structure, without hitting the database again. It also has the added bonus that when it comes to presenting the front end in a browser, we can feed it directly to the browser if it's "XML compliant" (IE5, though there is a patch for IE4 [microsoft.com]).
B.
PS: You'll also find that XSL can do similar things to your XML as CSS does to HTML
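The flat-recordset-to-hierarchy step described above can be sketched as follows. The row layout `(id, parent_id, name)` and the element names are assumptions made for illustration; the comment does not say what the real schema looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat result set: (id, parent_id, name); parent_id None marks a root.
ROWS = [
    (1, None, "Products"),
    (2, 1, "Widgets"),
    (3, 1, "Gadgets"),
    (4, 2, "Blue widget"),
]


def rows_to_xml(rows):
    """Rebuild the hierarchy from one flat query result, without further DB hits."""
    root = ET.Element("tree")
    nodes = {}
    # First pass: create one element per row, keyed by its id.
    for row_id, _parent_id, name in rows:
        nodes[row_id] = ET.Element("node", id=str(row_id), name=name)
    # Second pass: attach each element to its parent (or to the document root).
    for row_id, parent_id, _name in rows:
        parent = root if parent_id is None else nodes[parent_id]
        parent.append(nodes[row_id])
    return root


tree = rows_to_xml(ROWS)
```

Two passes are used so that rows may arrive in any order; a child row can appear before its parent in the result set and still be attached correctly.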
server to server / business to business (Score:2)
The widget order fulfilment organization has a server that speaks XML over HTTP. We created a widget on our server to talk XML over HTTP to it. Instead of spending weeks working out how to communicate with some proprietary server in a proprietary format, we spent a few days interfacing our servers.
XML = server to server / business to business killer technology
The consumer may someday directly use XML but I don't see that coming soon on a broad scale. HTML (with Java, Javascript, CSS, etc.) will (IMHO) be the way consumers work the web for the near future.
Of course, I could be wrong.
XML FAQ (Score:4)
Server side solutions: Exeter XML Server (Score:1)
Beware XSL (Score:2)
Looking at any non-trivial XSL stylesheets, you can see what a generally bad idea it is.
My advice would be to use a real programming language with DOM bindings.
XML.com has a good article regarding XSL: XSL Considered Harmful [xml.com].
Note that XML.com also has some pro-XSL articles listed, but they aren't nearly as persuasive.
The bottom line is that the W3 "ordained" XSL to be part of the grand scheme of things, although the technology hasn't been developed in response to any particular problem.
Seems fairly easy to me... (Score:1)
So then, if you intend to use XML to store the data, and XSL to format it, the only part of the equation left is determining which stylesheet applies to which requesting client. I have no experience with XSL (I use XML for machine data, not for human eyes) -- is it possible to determine in the document which stylesheet to use? If so, it's just a matter of writing all the stylesheets.
Of course, this all depends on everyone understanding XML and XSL. If people insist on using legacy clients (like non-XML compliant web-browsers *cough*Netscape*cough*), then there is a need for "transcoders" to do the XML/XSL interpretation and spit out HTML (|| HDML || whatever) that works with that client.
P.S. If you want applications of XML, look in the b2b e-commerce world. I'll avoid the direct plug and not name the company I work for, but the whole industry is based on XML.
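On the question of whether a document can name its own stylesheet: the W3C's `xml-stylesheet` processing instruction exists for this. A minimal sketch of reading it with Python's standard library follows; the sample document and the file name `catalog.xsl` are made up for illustration.

```python
import re
from xml.dom import minidom

# Hypothetical document that names its stylesheet in a processing instruction.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="catalog.xsl"?>
<catalog><item>widget</item></catalog>"""


def stylesheet_hrefs(xml_text):
    """Return the href of every xml-stylesheet instruction in the document prolog."""
    dom = minidom.parseString(xml_text)
    hrefs = []
    for node in dom.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data is a string of pseudo-attributes, not real
            # attributes, so it has to be picked apart by hand.
            match = re.search(r'href\s*=\s*["\']([^"\']*)["\']', node.data)
            if match:
                hrefs.append(match.group(1))
    return hrefs
```

A transcoder could use the returned href as its default and still override it per client, which covers both options raised in the original question.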
XML is perfect.. (Score:1)
just in case you hadn't heard...
Yes but surely only for READ-ONLY resultsets. (Score:1)
Perhaps you should take a look at M$ ! (Score:2)
Believe it or not, the open-source bug has bitten M$ !
Look into M$'s sponsorship of the Schools Interoperability Framework (www.schoolsinterop.org) and maybe you can see how M$ plans to use XML (and its derivatives) in real-world applications.
XML (Score:2)
The downside is that the solution will involve extra processing steps, extra stuff to be implemented, and impose on you a development model that might not always be convenient (not everything wants to be a document, or a conversion or transcoding between document formats).
However, there are many cases where XML is the only viable solution, and in those cases you're just glad you can solve the problem at all! A typical example is when you have documents coming from multiple sources, and you publish them to multiple targets. It's easy to see what the XML solution would look like--but the problem doesn't even fit into the other ways of doing things.
With WebMacro [webmacro.org] a common implementation strategy is to drop key XML objects into a template that is otherwise created through ordinary WebMacro HTML template gunk.
The advantage of this approach is that you can create the bread-and-butter stuff like shopping carts, authentication, login/logout, using ordinary Java servlet code and templates. (These things are nasty when you try and force them into a document model).
Then in the middle of your page somewhere you have your XML document, rendered using XSLT or something. You have other targets, besides your servlet, where you publish that same XML document, so the whole thing winds up being a rather pleasant mixture of two different programming paradigms.
Again, the key insight in this strategy is that you use XML for the parts of your problem where it is the only viable solution--and you do everything else the normal way (without the extra costs imposed by XML, since you don't need the extra power).
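The mixed strategy above can be sketched like this. It is not WebMacro (which is a Java template engine); Python's `string.Template` is swapped in as a stand-in, and the page layout, element names, and sample article are all hypothetical.

```python
from html import escape
from string import Template
import xml.etree.ElementTree as ET

# Ordinary page template: login state, navigation and so on live here.
PAGE = Template("<html><body><p>Hello, $user</p>$content</body></html>")

# Hypothetical XML document that is also published to other targets.
ARTICLE = "<article><title>Release notes</title><para>First item.</para></article>"


def render_article(xml_text):
    """Turn the XML document into an HTML fragment (stand-in for an XSLT step)."""
    doc = ET.fromstring(xml_text)
    parts = ["<h2>%s</h2>" % escape(doc.findtext("title", ""))]
    parts += ["<p>%s</p>" % escape(p.text or "") for p in doc.findall("para")]
    return "".join(parts)


def render_page(user, xml_text):
    """Drop the rendered XML fragment into the otherwise conventional template."""
    return PAGE.substitute(user=escape(user), content=render_article(xml_text))
```

Only `render_article` knows about XML; the rest of the page is produced the conventional way, which is the division of labour the comment argues for.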
I worked in an SGML shop for a couple of years, and became smitten with SGML/XML. I set out to do absolutely everything I could in SGML/XML for awhile, before realizing that a traditional template tool (like WebMacro [webmacro.org]) was far more useful for typical bread and butter servlet programming.
I still use XML a lot, but now I use it intelligently, where it's needed!
XMLTP (Score:1)
Re:On the browser (Score:2)
The current situation however is that there is a plethora of browsers, which is growing rapidly, with big differences among them, between OSes and even between versions...
To develop a number of websites, one simply cannot assume that users will have a specific browser on a specific OS of a specific version...
We might evolve to a pure XML/XSL/CSS browser eventually, but until then, there has to be a different solution that can serve today... You would be amazed how many people still use Netscape 3, just because they don't have the urge to upgrade...
Java servlets are a technique, but again: it's built into the server. There are a number of servers out there, that don't have these servlets, so that another solution would come in reaallllyyyy handy.
Okay... I'll do the stupid things first, then you shy people follow.
Re:Yes but surely only for READ-ONLY resultsets. (Score:2)
I agree that that would be very very ugly, but then, I also don't think that a system should necessarily be trying to provide concurrency on the client side, especially if the client base is expected to be extensive.
In this case, as you say, record locking and concurrency handling problems would all but preclude the use of anything but the most 'beefy' RDBMS's.
In my case, perhaps I am lucky in that user interaction is not 'live', but transactional. I just present some output, and wait for the user to respond in whatever way. Once that response comes in, I have a heap of middle-tier business logic handling exactly what we should do with it.
Record locking and such issues are dealt with at that level, rather than in the backend.
And yes, I do believe that SQL Server could handle such a solution, coupled with MTS and perhaps using a little DCOM
In any case, transactions can do nothing but help the cause
Re:XML (Score:1)
There is indeed a lot of work to be done on a framework from which one can develop these applications, but its components should be recycled very easily, especially when your company's main business is developing on-line applications ;-)
Okay... I'll do the stupid things first, then you shy people follow.
Re:Related: Client-side data on demand? (Score:1)
XML and XSLT are the way to go (Score:3)
As far as IBM's product goes - once you drill down into the technical details, it looks very much like cocoon. Interestingly enough, some of the closed source components that IBM's product relies on were donated a few months back to jump start the xml.apache.org site (namely, the XML4J parser and the Lotus XSLT processor). The main thing that IBM seems to be offering here is its 'transcoder' technology - which may be interesting and certainly bears investigation, but for my money, you're better off checking out (and having a voice in the development of) the open source apache projects.
Re: more *very* useful uses of XML (Score:1)
Anyone who says "XML is one of these words everybody's talking about yet no-one really knows how to use it in specific applications or server technologies." has probably not noticed the whirlwind of activity (including many bona-fide commercial ventures) surrounding XML.
Hundreds of sites today buy syndicated news from central sources (iSyndicate.com and newsreal are two that come to mind) and receive their news feeds via XML. Also, check out webmethods.com -- here's a phenomenally successful company whose entire business model is based on XML-enabling businesses.
i'm workin' on it, dammit. (Score:3)
xml rocks. every piece of online information should be in xml. usability on the web is horrible right now. the fact that search engines and yahoo-style directories are the main entrances to the web is horrific. the fact that google can't find me a single page on gkrellm (a kick-ass system monitor for linux) pisses me off to no end when i'm bored with my current skin. with everything in xml the extraction of data would be much simpler and therefore the interfaces to the web would be much more effective.
the current problem is that
i'm working on a solution and need help...so it's actually pretty smooth that this article came out in ./ at this point.
in a huge blow to problems #1 and #2 above (as well as quite a few others), i am initiating the creation of Uberbia, the most open source of web sites. the backend is zope, which is a tres cool open source web application environment which can conveniently output its internal data as xml. what this allows is for information to be created in zope and stored in zope's native db format and served up as web pages (for instance) quickly, but then also output as xml. problem #2 solved. and when browsers can handle the xml...shove it out that way.
zope also allows for information to be very easily created and shared. this is one of the main goals of Uberbia.
the idea for Uberbia was born out of the fact that the Open Source community has been living in an environment of relatively closed content management on the internet. Sure, one could create a web page and post a HOWTO they just wrote. And then post a message to a relevant mailing list letting everyone know that resource is available. And then submit the HOWTO to the LDP and wait for it to be approved and posted on the LDP page. Uberbia will remove a lot of this hassle and allow the Open Source community to easily create and manage its content. and the data will go into an xml-aware application. problem #1 solved, at least for the Open Source community. well, okay...so i'm still workin' on it, but it'll get solved, dammit.
on trying to figure out what i was talking about, Ethan (a friend and to-be-developer of Uberbia) wrote:
sounds to me like you want to build an open-content information space. am I totally off-base? Bring "source" up to the next level of abstraction? Collaborative environments of information?
yup. he gets it. but the possibilities that arise from having such a body of contributors and open content in xml are insane. for example, imagine turning on a "newbie" feature in Uberbia that automagically inserted links to the proper entry in the jargon file for every word that was defined there. not difficult with zope and the data in xml
so, essentially i'm responding to this ask slashdot question by calling out for help with an open source project that wants to solve this problem and others. some work has been done, but there's a lot more to do. sourceforge is graciously both hosting the development of this and hosting the project itself. if you are interested at all in the development of something like this or have some really smooth-ass ideas, let me know [mailto] or join the mailing list [sourceforge.net].
i hope some of that made sense.
word, Uberdog
Re:Related: Client-side data on demand? (Score:1)
Maybe I'm just a gratuitous waster of bandwidth, but I'm starting to lean heavily towards writing a proxy that _automatically_ grabs all links' content on a page when I hit the main page itself.
IE5 already does XML+XSL (Score:2)
It isn't too bad, either.
If no XSL stylesheet is applied then it displays the XML document using a "TreeView" default style sheet.
Also, because the XML parser & XSL thing is COM based you can use it in any language that supports COM - like Javascript/VBScript/ASP. I hate to be a MS lover, but unless you go to Java there isn't much that can do it better than that.
The new XML parser that comes with Win2000 is supposed to be 5 times faster, too. See MSDN [slashdot.org].
As far as I know there is no support in IE5 for XML+CSS. I may be wrong, there, though.
Re:Mino XML parser (Score:1)
But that is true for any web based system (Score:2)
You couldn't do it with HTML, either, could you?
Any server that uses stateful connections like that is going to have to be big & powerful.
You're looking at the problem the wrong way (Score:3)
XML really doesn't change any of the domains EXCEPT the presentation domain. You don't need an XML enabled DB, as you NEVER want to have the outside world talking directly to your DB. XML (combined with HTTP or whatever else) is one way of presenting your application. The various transforms that you would do using XSL are just "aspects" of the same presentation. So this doesn't completely change the way you build applications, just how you do your presentation.
I've written more than a few apps that were available both as GUI applications and web servers. Both versions shared the same code base up until the last layer.
As far as what you need to build an XML system, I think it's a lot like an existing HTML system. For an HTML system, you need a database server, an app server, and a web server. The web server is normally scripting enabled so you can do handy transforms with the raw data.
With XML, it's basically the same concept, except your "XML server" needs to be using XSL to script transforms of the XML data. What we currently don't have is a very good way of doing this. Ideally you'd actually want the CLIENT to do the transforms as the XML data is usually much terser than whatever the XSL will generate. However, nobody trusts the clients to do this, so you might as well go with the XSL engine on the server.
Some examples... (Score:3)
There are many tools available to build such a system.
To mention only Open Source projects, I could suggest using Apache JSERV [apache.org] with Apache Cocoon [apache.org] as a framework, Castor [slashdot.org] or Quick [jxml.com] to bind XML data to Java objects, and an OODBMS like ozone [ozone-db.org] or an RDBMS like PostgreSQL [postgresql.org].
These are my favorites ;)
They are very powerful and highly flexible, but the price to pay is that they are rather complex to use, that you need time to get up to speed with them, and that you lose focus on the core techniques behind them.
To try to get a good understanding of these core techniques, I have set up some simple examples showing how one can bind XML documents into Java objects, store these objects in an OODBMS and use them in an XSLT sheet, either in standalone mode or as a servlet.
These examples are available on our web site at http://downloads.dyomedea.com/java/ [dyomedea.com] and a mailing list [egroups.com] has been created to exchange and discuss such basic tips.
Hope this helps.
Eric van der Vlist
Re: more *very* useful uses of XML (Score:1)
Re:Mino XML parser (Score:1)
Re:Beware XSL (Score:3)
I wouldn't write off XSL on the strength of that article at xml.com...
When I first looked at XSL some months ago, I thought that it would be a messy and difficult language. I was wrong. XSL, IMHO, is the right solution for translating XML into pretty much anything. Yes, it does have a steep initial learning curve (much like our favourite OS).
As for non-trivial XSL stylesheets? On our project, we have written XSL to transform our XML data into binary outputs. The stylesheets used ran into tens of thousands of lines! I think that qualifies for non-trivial in anyone's book. I admit that the XSL is difficult to read, but show me any source that is easy to read when >10k lines...
XSL as a complete solution? No. Even in a relatively simple XML to HTML documentation tool I wrote, I called the XSL from a JavaScript app that handled things like file access and other helper functions. This was under Win2k, using the built-in script engine to call the XSL via COM. (yes, even MS gets things right sometimes) The point is that XSL is better for transforming XML than trying to use a DOM-manipulating language binding...
On another note, why does everyone assume that XML is solely for exchanging data on the web/net? I've used it for documentation, log files, test cases, application persistence and application exchange formats. It's a lot more useful and flexible than people think.
XML Script (Score:3)
You might like to check out this page [xmlscript.org]. One of the things they have is an interpreter (X-Tract) that reads a template (written in XML!) and performs pretty much arbitrary transformations on XML input data based on this template. Looks pretty cool and simple to use. X-Tract is free for download. Funny I didn't find any info on license terms though.
I tried doing some very simple stuff with the Linux version, and the only complaints I have are:
XML and MetaHTML (Score:3)
like programming designed to emit HTML (it was developed before XML was invented). It was developed by Brian Fox and myself when we had a company called Universal Access (ua.com). MetaHTML is superior in some ways to XSL, because it is more of a general-purpose programming language, yet its evaluator does a lot of the work of parsing XML syntax expressions. We used to use it to do many XML-ish things, such as generating the MetaHTML documentation automatically from a structured representation in the database.
MetaHTML has also been under GNU public license since about 1996.
Shameless plug (Score:1)
Grr.. (Score:1)
Re:Grr.. (Score:3)
Re:Shameless plug (Score:1)
Otherwise, nice work -- keep going!
Re:Mino XML parser (Score:1)
This is not a troll, I seriously don't know and want to.
10,000 line stylesheets (Score:2)
This is supposed to be good? Something is horribly broken. Perhaps a different tool would be more appropriate? How about a parser generator? (see Jikes [ibm.com])
Re:Shameless plug (Score:1)
Don't know why this comment got mangled the first time, it did look right in the preview...
Otherwise, nice work -- keep going!
Re:Grr.. (Score:1)
We already do this. Our website is live. (Score:4)
I work for AssureSoft whose AssureWeb website is live (work out the URL for yourself, it's not obscure but we don't want to be slashdotted). The site provides financial information to subscribers. You have to have a username and password to get the full range of services- we dole out passwords free to British independent financial advisors.
Our first XML-based service is a quotations system which allows users to get a quote for a pension or mortgage from a wide range of companies in real time (typically 5-20 secs).
Why we needed XML
Our problem was that each company had a slightly different way of asking for customer details. We decided to create an XML data type definition, now adopted as an industry standard by the UK financial standards body Origo. This standard means that we can present pretty much the same input form, with a few optional extras, for any financial product.
The main use of XML is in passing the input data from our web server to the companies' quotes servers.
Layer 1: Client Browser
Layer 2: AssureWeb server
Layer 3: Company Quotes server
The XML goes back and forth between layers 2 and 3. We compile standard CGI GET/POST client requests into XML on the webserver and fire them at the quotes server. The quotes server fires back a response as XML again, and we parse this and present it to the client as a standard HTML web page. There is no XML on the client side.
Provided the company quotes server conforms to our XML standard, we can use that server for quotes. Adding new products or companies becomes a lot easier- typically we can go from scratch to beta with a new product within days. Previously it would have taken many months to write and test each individual product. XML allows us to re-use both code and input/output standards to a level never seen before.
Our next step will be a comparative quotes service. Users will be able to enter one set of data, and fire it at multiple companies. They will then get back multiple quotations, from which they can select the best based on their criteria. Effectively we will be having multiple concurrent layer 3 transactions.
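The layer 2 role described above (form data in, XML out to the quotes server, XML back, HTML out to the browser) can be sketched roughly as below. The element and attribute names (`quoteRequest`, `field`, `quote`, `premium`, `company`) are invented for illustration and are not the Origo standard, and the network call to the quotes server is left out.

```python
import xml.etree.ElementTree as ET
from html import escape
from urllib.parse import parse_qs


def form_to_xml(query_string):
    """Turn a CGI-style query string into a (hypothetical) quote-request document."""
    request = ET.Element("quoteRequest")
    for name, values in parse_qs(query_string).items():
        # Only the first value of each form field is used in this sketch.
        ET.SubElement(request, "field", name=name).text = values[0]
    return ET.tostring(request, encoding="unicode")


def response_to_html(xml_text):
    """Parse the quotes server's XML reply and present it as a plain HTML table."""
    doc = ET.fromstring(xml_text)
    rows = "".join(
        "<tr><td>%s</td><td>%s</td></tr>"
        % (escape(q.get("company", "")), escape(q.findtext("premium", "")))
        for q in doc.findall("quote")
    )
    return "<table>%s</table>" % rows
```

Because `response_to_html` loops over every `quote` element, the same function would serve the comparative service mentioned above, where several companies' replies are merged into one response document.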
--
Standard formats are NOT required. (Score:1)
http://www-4.ibm.com/software/developer/library
Joe
Re:Shameless plug (Score:1)
Licensing isn't mentioned on the site because it hasn't yet been fixed. It is currently not open-source, although X-Tract is and will remain free for non-commercial use. Other aspects of licensing very much depend on how we sell XML Script server apps to commercial companies, and that's yet to be decided.
As to the _data file="http://...", that's a bug - could you please send this to support@xmlscript.org so we can get to the bottom of it?
Re:XML Script (Score:1)
We're not sure why the URL retrieval isn't working (it's another library that does this for us, so we'll have to look into it further) but can you send us a bug report on the CGI thing? support@xmlscript.org [mailto] is your best bet.
We know the documentation isn't the best in the world, and we have been working on it lots for v1.1 (which you should expect to see later this week). Thanks for the feedback.
IE5 XSLT is not standard. (Score:2)
Do not assume this to be a case of embrace & extend. Microsoft just implemented XSL before the spec was finalised. They say they will bring out a compliant version soon.
Re:Mino XML parser (Score:2)
One use for XML is that you can develop entire sites using your own tag set instead of HTML. For example, if you want to represent a list of books in HTML, you would probably set up a list of items. In XML, you can do:
<book>
<name>Some Book</name>
<author>Some Author</author>
</book>
...
Which is much easier to understand. Using XSL (a stylesheet language for XML) or a parser built specifically for your tag set, that <book> tag and its subtags will actually mean something.
Writing your entire site in XML has other advantages. For example, let's say you have 100 pages on your site, all written in HTML. Now you want to change the layout of the entire site. You would have to modify the HTML of all 100 pages. If all those pages were written in XML, however, you would have to modify only one file, the XSL stylesheet.
XML also has support for namespaces. A namespace (in XML) is a group of tags. Each namespace has a URI. For example, the upcoming XHTML 1.0 namespace is http://www.w3.org/1999/xhtml (that link does not actually exist though). Namespaces are very useful. If you were writing a document in XHTML and wanted to include tags from your own tagset, you would call in your namespace, and you would suddenly be able to use your own tags.
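To make the namespace idea concrete, here is a small sketch that mixes XHTML with a custom book tag set and pulls out only the custom elements. The XHTML URI is the one quoted above; the `http://example.org/books` URI and the `b:` prefix are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
BOOKS = "http://example.org/books"  # hypothetical URI for a custom tag set

# An XHTML page that "calls in" the book namespace alongside ordinary markup.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:b="http://example.org/books">
  <body>
    <p>Recommended reading:</p>
    <b:book><b:name>Some Book</b:name><b:author>Some Author</b:author></b:book>
  </body>
</html>"""

# Prefixes used in the search paths below; they need not match the document's.
NS = {"h": XHTML, "b": BOOKS}

root = ET.fromstring(DOC)


def book_titles(doc_root):
    """Collect the name of every book element, ignoring the XHTML around it."""
    return [
        book.findtext("b:name", namespaces=NS)
        for book in doc_root.findall(".//b:book", NS)
    ]
```

The parser identifies each element by namespace URI plus local name, so a module written for the book tag set can be handed only the `b:` elements and never has to know about the surrounding XHTML.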
My parser will have XSL support soon. For now, you can write the modules in C/C++ and the parser will load them automagically using the namespaces and parse the XML.
I have a few articles/tutorials I've written over at gelicon.com [gelicon.com] on XHTML, XML, DTDs, and namespaces. Hopefully they will offer a better understanding.
Re:Uses of XML in the real world... (Score:2)
XML is just a consistent way of presenting information, not some major enabling-technology.
There is XSLT and then there's XSL:FO (Score:1)
XSL Transformations:
Transforms any XML document type into another. This can include HTML if it is well formed e.g. XHTML. In reality, it really is not just for "Stylesheets" but can also be used for data to data transformation. The W3C have published a recommendation (their version of a standard) and there are many implementations.
XSL Formatting Objects:
Formats XML for print or screen display. Powerful, complex typesetting-style system; you could use the analogy "PDF/Postscript for XML". Not a standard yet, and only one partial implementation of an old working draft (FOP).
A lot of the guy's criticism in that article refers to the second part of XSL, which is not what people are using, or referring to, when they discuss XSL here.
I don't find the guy's article that persuasive; it is full of assertions, without proving them. Most of the guy's gripes are directed towards formatting objects, which is complex, but the momentum behind XSL relates to XSL Transformations.
Re:Yes but surely only for READ-ONLY resultsets. (Score:1)
InfoShark [infoshark.com]
Re:Related: Client-side data on demand? (Score:1)
Re:10,000 line stylesheets (Score:1)
Apache XML Project has many tools for this (Score:1)
From the way you've put the problem, it seems that XSLT is the answer to your questions.
The Apache XML project has an XSLT processor called Xalan that can take care of much of that part (I haven't tested any other XSL processors yet). Just link your XML document / DOM tree to a style sheet and you have a document transformed to the format you like.
The only reason I see that this is needed is that nowadays only IE 5 and Mozilla can work natively with XML files and linked style sheets (and that locks you to CSS for Mozilla), so if you plan to use XML with any other device, AFAIK, you will have to use some kind of transformation processor. It can also be used to transform an XML doc into another XML doc, but that goes beyond presentation.
Just take a look at their page and make some tests. They're pretty nice tools, and quite easy to work with.
--
Marcelo Vanzin
Re:Standard formats needed... (Score:2)
What we do need are tools to manipulate XML. The tools for reading and writing XML are already there. What we need next are tools to transform XML documents (the standard for specifying these transformations already exists: XSL).
I think there are several initiatives in this direction. (sorry I don't have any references).
Like many people I see a great future for XML but I think the coming few years will be characterized by a lot of redundant programming since everybody will individually attempt to implement more or less the same components. It would be nice to see some reusable components on the serverside.
Learning curve of XSL (Score:1)
You create templates to match the different kinds of elements, and work your way down the tree of the document. This approach allows the stylesheet to work with documents with different numbers of elements, or slightly different structure. Some problems are solved with recursion.
You can do a simple approach where you have a fixed structure document, and insert values from the XML at certain points. This works for a lot of problems.
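For anyone who finds the template idea hard to picture, here is a rough sketch of the same pattern in plain Python. To be clear, this is not XSLT and not how any XSL processor is implemented; it only mimics "match a template per element, then recurse into the children". The element names and the HTML output are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical input document; the element names are invented for this sketch.
SOURCE = """<order>
  <customer>Ada</customer>
  <item>Widget</item>
  <item>Gadget</item>
</order>"""


def apply_templates(node):
    """Dispatch a node to the template registered for its tag."""
    template = TEMPLATES.get(node.tag, default_template)
    return template(node)


def children(node):
    """Recurse into the child elements in document order."""
    return "".join(apply_templates(child) for child in node)


def default_template(node):
    # Unknown elements: drop the wrapper but keep processing the children.
    return children(node)


def order_template(node):
    return "<html><body>" + children(node) + "</body></html>"


def customer_template(node):
    return "<h1>" + escape(node.text or "") + "</h1>"


def item_template(node):
    return "<li>" + escape(node.text or "") + "</li>"


# One "template" per element name, looked up while walking down the tree.
TEMPLATES = {
    "order": order_template,
    "customer": customer_template,
    "item": item_template,
}


def transform(xml_text):
    return apply_templates(ET.fromstring(xml_text))
```

Because the recursion is driven by whatever children are present, the same templates cope with one item or fifty, which is the point made above about documents with different numbers of elements.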
The main problem I had when learning XSL was study material. The specifications don't function as a tutorial. I recommend http://metalab.unc.edu/xml/books/bible/updates/14.html [unc.edu]. It is a version of the chapter on XSL from the XML Bible, updated for the W3C recommendation. I wish I had found it sooner (I have the book, by the way, very good).
Re:Related: Client-side data on demand? (Score:2)
A second reason you'll have less network traffic is that you don't have to put layout information in the XML files. Rather, you download a separate XSL file (which can be cached). Subsequent communication consists of data only.
Microsoft has some nice demos on their site (yes, I know it's proprietary and all, but it's there) and I think Mozilla also has a few nice demos.
An interesting way to connect a DB & XML (Score:1)
Instead of converting the entire database to an XML file, which consumes a lot of resources, and has synchronization issues, this approach places an XML API frontend on the JDBC system. This creates a "virtual" XML document that other XML tools can access via DOM or SAX.
For example, they create a SAX frontend for JDBC, and use it with a SAX-based XSL tool (XT) to transform the data to HTML. So, for example, where the database encounters a column for CustomerName, the template for a CustomerName entity in the XSL sheet is triggered. To the XSL tool and stylesheets it seems as if they are accessing an XML document.
http://www.javaworld
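A minimal sketch of the "virtual XML document" idea, in Python rather than Java: SQLite stands in for JDBC, and a small SAX handler stands in for the XSL tool. The table, the column names and the `emit_rows` helper are all invented for this illustration and are not taken from the article.

```python
import sqlite3
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def emit_rows(cursor, handler, root="rows", row="row"):
    """Walk a DB-API cursor and fire SAX events as if an XML file were parsed."""
    no_attrs = AttributesImpl({})
    columns = [description[0] for description in cursor.description]
    handler.startDocument()
    handler.startElement(root, no_attrs)
    for record in cursor:
        handler.startElement(row, no_attrs)
        for name, value in zip(columns, record):
            # Each column becomes an element named after the column.
            handler.startElement(name, no_attrs)
            handler.characters("" if value is None else str(value))
            handler.endElement(name)
        handler.endElement(row)
    handler.endElement(root)
    handler.endDocument()


class CustomerNames(ContentHandler):
    """A consumer that only reacts to CustomerName elements."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None

    def startElement(self, name, attrs):
        if name == "CustomerName":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "CustomerName" and self._buffer is not None:
            self.names.append("".join(self._buffer))
            self._buffer = None


def demo():
    # Hypothetical in-memory table, just to have something to query.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (CustomerName TEXT, City TEXT)")
    db.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("Ada", "London"), ("Grace", "New York")],
    )
    handler = CustomerNames()
    query = "SELECT CustomerName, City FROM customers ORDER BY CustomerName"
    emit_rows(db.execute(query), handler)
    db.close()
    return handler.names
```

The handler never sees a file or a string of XML; it only sees events, so no intermediate document has to be built or kept in sync with the database.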
A small warning... (Score:3)
It's slow. VERY slow.
Most XSL implementations have significant performance and scalability issues compared to the more common custom technology for producing dynamic web pages.
There's no argument that it's a better technology, but I've known several commercial web sites that spent considerable resources developing XML/XSL implementations and then had to roll back the technology when they discovered they needed four or five times the number of servers to be able to use it.
Anyone know of any top-tier sites that are actually using the technology?
NNTP? (Score:2)
XML should be used where it's appropriate. I'm unconvinced that client-side transformations are the right thing.
do as the ORBs do (Score:1)
This has the advantage of reusing existing (ORB) technology for new purposes, and fits into an existing ideology that many already understand.
You would put a client of one of these XML ORBs into Apache or your browser client, and be able to exchange documents and DTDs freely just as with code objects and traditional ORBs.
Or so I would hope.
Any XML/SQL mapping ? (Score:1)
Re:Uses of XML in the real world... (Score:1)
I saw a briefing from someone at the W3C at the last Builder.Com conference, and he had some interesting things to say. Specifically, he stated that the best uses he could see for XML right now are long-term storage and data conversion. With XQL being nothing but vapor right now, searching and parsing it on the fly is a nightmare.
It just seems to me that an Oracle 8i backend database that burns out static HTML, if you need speed, would be simpler than trying to incorporate an XML solution, unless you needed multiple systems with different architectures to work with the same data. Even in that case you could just parse the database data into XML pages instead of HTML.
Anyone formatting XML output for printing? (Score:1)
I'm thinking of something along the lines of Formscape which can format data for invoices and purchase orders and such.
Why? (Score:4)
Spend your time working with those tools (XML4C, expat, rxp to name a few) to create higher-level tools. Don't re-implement an XML parser - I can guarantee you it will be full of obscure bugs where you didn't understand the spec, didn't understand how to cope with character encodings, or just did something wrong. This stuff, despite the XML spec suggesting that a graduate could write a parser in a matter of weeks, is hard, and experienced people (such as James Clark) have put out excellent products for all to use under non-restrictive licences. There's even an LGPL parser already out there called libxml (ships with GNOME).
If you don't believe you'll create a broken parser, see the recent XML conformance tests on XML.com.
I'd also love to see you move from a non-working XML parser to something supporting XSL "in the near future". I appreciate your enthusiasm, but the XPath spec has some tough little nuts to crack (I know - I'm cracking them right now) and then implementing XSLT from an 80-odd page spec - wow - good luck to you!
(I'm not trying to poo-poo your project, but so many people start working on stuff that's already being worked on in the open-source community that it's just wasted effort).
XML Repositories (Score:1)
Fact is, XML is great for data interchange, plugging large amounts of standard information into standard forms (POs, RFQs and other business docs), as well as putting some muscle into search engines via context-based searching (via XML metadata), but there are way too many standards out there.
- BizTalk - This is the standard, open nonetheless, that MSFT [microsoft.com] is developing to standardize XML. It is an open standard, but the obvious benefit to MSFT is that they can plug Biztalk functionality right into all of their product lines for interoperability across a platform.
- OASIS's XML.org - OASIS, a non-affiliated standards body much like the W3C, set out to develop a standardized set of XML schemas and DTDs (document type definitions). However, MSFT beat them to the punch by launching their BizTalk site a day before OASIS. Ahhh, Microsoft finds a way to compete even in open standards.
- RosettaNet - These guys set out to "map" all common business processes and to make an open standard for XML in the business world but, alas, mapping entire processes takes a long time. A lot of notoriety here, not as much substance.
These are just a few examples; there are others, but my guess is that you'll hear the most about these folks. To make things even more complicated, although these guys seem to be "competing", they are almost all members of each others' groups, in a sort of "co-opetition" model. So, overall, it is no wonder that the big push is for standards repositories and related translation to and from various formats.
That's my $.02
Performance Issues, XSL and Available Tools (Score:1)
On performance, it really matters what kind of parser you use. There are two standard parser interfaces:
There has been a lot of argument this year over whether or not to use XSL to style XML documents. I think the jury is still out on this -- at least as far as pure display style is concerned. (There are a lot of CSS loyalists out there as well.) But XSLT as a transformation language for XML is a real winner. One of the reasons is simple but profound -- XSLT is XML and is parseable and transformable just like any other XML document. You can create a stylesheet by using another specialized XSLT sheet to transform an XML or XSL document into the stylesheet you want. This can be very powerful, but difficult to debug.
Finally, I am surprised that nobody on this site has mentioned the expat (stream based) parser by James Clark that is an almost standard part of the modules for Perl5. I am learning Perl using the ActiveState port on NT and am having a whale (camel?) of a time, and the expat parser is clean and fast and fun.
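For what it's worth, expat is not limited to Perl. Python's standard library wraps it too, as `xml.parsers.expat`. A tiny sketch of the stream-based style, assuming all you want is to count elements by name (the function and the sample input are made up for illustration):

```python
import xml.parsers.expat


def count_elements(xml_data):
    """Count start tags by name using expat's callback (stream) interface."""
    counts = {}

    def start_element(name, attrs):
        # Called once per start tag as the parser streams through the input.
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(xml_data, True)  # True marks this as the final chunk
    return counts
```

No tree is ever built; the parser just calls back as it goes, which is where the speed and the low memory use of the stream approach come from.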
Oh, and one final note -- while there are some really useful books on XML, I suggest you keep to the basic reference type (Neil Bradley's The XML Companion is next to me on my desk right now, and there is a second edition out) and use the net as your basic resource, especially lists like XML-DEV. Things are moving way too fast.
SGML Word Processor (Score:1)
SGML or XML would seem to be perfect for an open source word processor. One of the biggest obstacles of exchanging information in business is the many proprietary document formats. It would seem that if such a program could become the standard (I know that's a big if), it could be a potential killer app for linux in the business world. Especially if it came out on linux first. But even if it didn't, the linux version could be free whereas a windows version would most likely be proprietary. And I would place far more trust in an open source application complying with standards than I would one which is closed.
I know word processing isn't fun or sexy, but it's an extremely important part of computing and should receive more attention than it has.
Re:SGML Word Processor (Score:1)
Help solve the problem (Score:1)
This question is what the people on the Apache XML project [apache.org] spend more or less all their time on, not just talking about it but building stuff. If you care, join up.
Having said that, XSLT may be magic, but "old-fashioned" solutions like PHP and Zope and plain old perl-backed CGIs (perl includes an excellent XML parser) ain't going away anytime soon.
Re:Why? (Score:1)
Re:XML (Score:2)
XML doesn't solve this problem either. Writing a different stylesheet for each browser winds up being just as much work. The key is to get all of that work out of your source code, so that it is independent of the application. You can do that by using a template system.
The IBM example has multiple sources of documents feeding multiple target formats, where those targets are diverse--not just different forms of HTML, but different media altogether. In those cases XML is a big win.
XML applications *do* exist (Score:2)
XML is one of these words everybody's talking about yet no-one really knows how to use it in specific applications or server technologies
I disagree. Check out the W3C's SVG standard [w3.org]. This is for real.
If you've ever had to muck about with all of the different proprietary flavors of vector graphics formats, you know what a great thing this will be.
That said, I personally *don't* believe in across-the-board XML standardization panacea. Some things deserve standardization, others don't.
Accountants all adhere to accepted standard accounting practices. This is what makes it possible to encapsulate their work into shrink-wrapped database products that pretty much any accountant can use. But this only works because the process is so well known.
So I disagree vehemently that business-to-business transactions, for example, are ripe for XML standardization. Why? Because who the heck is such an expert on these kinds of transactions to be telling everyone else how to do it? There's a lot of trial-and-error to go through before anyone should start proposing standards.
And remember: "You can't vote for anarchy". ;~)
The Road Ahead --- and some pitfalls. (Score:1)
My reason for going on about multi-user, record-locking databases is this: assume you build a good web site, nice and fast and so on, used by many people. I would suspect that, as per the old adage 'No good deed goes unpunished', your boss would then ask you to build a more interactive site.
Then you realise to your horror that XML doesn't really help at all when it comes time to re-mesh updated/changed XML 'data bursts' back into the main DB.
Another thing that just occurred to me - Surely the queries needed to get the hierarchical data have to be expressed in SQL. If so, surely the cost in terms of logical/physical reads (i.e. the cost to the server of doing the queries) will be the same whether you do them all at once, to build your XML 'data burst' or whether you run them just as the user requests them.
In Oracle you can keep open connections to the server at all times (and even pre-start some at DB startup) i.e. the connection latency is very small. I think SQL Server would have to be configured to pool connections in some way. Does SQL Server 7 let you do this? Does MTS let you do this? I'm not sure.
BTW, what are your feelings on MS having to delay the In-Memory Database and COM+ (component) load balancing? As I remember, they had to drop them from basic W2000 Server and have said you'll get them in the W2000 Datacenter edition. It might be that without these features your DCOM and MTS architecture runs out of steam. (You might even have to tell your boss to splash out on W2000 Datacenter edition as well!)
Just some thoughts.
Re:Related: Client-side data on demand? (Score:1)
Here's an interesting example:
XSL Sample [microsoft.com]
Which is from the following article:
Choosing between XSL and CSS [microsoft.com]
Of course, solutions like this aren't very appropriate yet for public websites, as they require IE 5.0. But the technology is very exciting.
There are several other examples on this site that utilize client-side XML processing to dynamically change the way data is displayed - sorting a baseball roster by name or batting averages, or even calculating and displaying statistics on the client.
Re:Related: Client-side data on demand? (Score:2)
Anyway, I don't think that the bandwidth problem is caused by either HTML or XML. The real problem is the objects that are referenced, for instance gif or jpg images, and that won't change, I'm afraid.
XSL can't match programming languages from the 60s (Score:1)
Does XSL encourage reuse through its syntax? No
Does XSL base its constructs on proven language design ideas picked up in the last twenty years? No
I have no idea why people are so ga-ga over a language that predates Algol-6x in its design.
10,000 line sheets necessary due to dumb syntax (Score:2)
For someone who uses a language like Python or Java, I can't imagine why they would find anything compelling about XSL. It really is a dog language. Most people are just too ga-ga over the fact that it is encoded in XML to see how lame it really is.
Thankfully, few people are rallying behind it.
Cocoon project? (Score:1)
So how do you convert XML->HTML or XML->WAP? (Score:1)
If you "already do this [convert XML->HTML or XML->WAP]", how does that work? Is it custom?
XML Summary and History -- Comments on Transcoding (Score:2)
I believe eventually we are going to get to a point where server-side transcoding will not be necessary. However, this will be several years, and we are going to have to learn how to do all of this efficiently.
I am even developing my own transcoding software process because I believe I have a better method of doing it than what is currently available. If and when I do succeed, it will be closed-source because I want to make money off of my product, not just give away all my hard work.
Anyway, the next few years are going to be very interesting.
E
Well, you've got it sort of wrong. (Score:1)
This format in particular offers no modularity or reuse features, and there is nothing about XML that strictly forbids such features.
Re:Mino XML parser (Score:1)
Re:XML Summary and History -- Comments on Transcod (Score:1)
Call me pedantic, but I have some issues with the following statement:
HTML and XML are related formats; in fact, HTML can be defined as a subset of XML.
This is a bit of a peeve of mine. HTML is an application of SGML, not a subset of SGML, and definitely not a subset of XML.
A lot of stuff that's in HTML is not legal in XML, like the IMG tag and the OPTION tag:
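A quick way to see this for yourself, sketched with Python's standard XML parser (the markup fragments are my own examples, not anything from a spec): an unclosed IMG and an OPTION with a minimized `selected` attribute are both fine in HTML 4 but get rejected by an XML parser, while the XHTML-style spellings are accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Legal HTML 4, but not well-formed XML.
HTML_STYLE = [
    '<p><img src="logo.gif"></p>',            # IMG is never closed
    '<select><option selected>One</select>',  # minimized attribute, open OPTION
]

# The same content written the XHTML way.
XHTML_STYLE = [
    '<p><img src="logo.gif" /></p>',
    '<select><option selected="selected">One</option></select>',
]
```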
Which is why XHTML [w3.org] was created.
Re:Performance Issues, XSL and Available Tools (Score:1)
I'd like to echo your shout out to James Clark's products. On the Java front, his XT library implements XSLT, and uses a SAX parser (which, as was pointed out, implies better performance than DOM).
http://www.jclark.com/xml/xt.html [jclark.com]
Yes, unfortunately XML is getting overhyped (Score:1)
XML only solves the problem of data formatting.
There are some doc-heads out there that are trying to wrap XSL, XQL, XPath, and some of the other proto-standards into one cohesive view of the world, but it really isn't there yet.
SQL databases are still the way to go for storage - more due to uptime and recoverability than anything else. Also, regular programming languages such as Python and Java, when used with DOM bindings are still a more powerful, efficient, and flexible solution than XSLT or XSL-FO.
Re:XML Summary and History -- Comments on Transcod (Score:1)
Re:Why? (Score:1)
I'm not saying everybody has to use this program, or that it will be the #1 XML parser. I'm just saying it's something useful I'm developing, which is also helping me learn a great many things about XML development.
Besides, it gives me something to do
Re:XML Summary and History -- Comments on Transcod (Score:1)
HTML has recently been slightly altered into the XHTML DTD.
A person can use any XHTML DTD in any XML document.
So saying that HTML is a subset of XML is not far from the truth. I am also willing to bet a person would have moderate success using a regular HTML DTD in an XML document, but it would not be worth it.
E
Re:FIRST JESUS POST (Score:1)
XML and Transcoding - How IBM would do it (Score:2)
As you hinted in your note, it can sometimes be a challenge to select the best stylesheet to apply to a given XML document. The gateway may want to choose a stylesheet based on the source document and the destination browser or device. In addition, different stylesheets may be better suited to specific user preferences or network connections. The IBM transcoding technology includes a way to select the "best" stylesheet to apply in a given situation.
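To make the selection problem concrete, here is a deliberately naive sketch in Python. It is not the IBM technology and makes no claim about how that product works; the lookup table, the device classes, the User-Agent substrings and the stylesheet file names are all hypothetical.

```python
# Hypothetical table: (document type, device class) -> stylesheet file name.
STYLESHEETS = {
    ("catalog", "wap"): "catalog-wml.xsl",
    ("catalog", "palm"): "catalog-palm.xsl",
    ("catalog", "desktop"): "catalog-html.xsl",
}
DEFAULT_STYLESHEET = "generic-html.xsl"

# Illustrative substrings only; real User-Agent detection is much messier.
DEVICE_MARKERS = [
    ("wap", ("wap", "up.browser", "nokia")),
    ("palm", ("palm", "avantgo")),
]


def device_class(user_agent):
    """Guess a coarse device class from the User-Agent header."""
    lowered = user_agent.lower()
    for name, markers in DEVICE_MARKERS:
        if any(marker in lowered for marker in markers):
            return name
    return "desktop"


def pick_stylesheet(doc_type, user_agent):
    """Choose a stylesheet from the source document type and the client device."""
    key = (doc_type, device_class(user_agent))
    return STYLESHEETS.get(key, DEFAULT_STYLESHEET)
```

A real gateway would also have to weigh user preferences and the network connection, as described above; this only covers the document and device part.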
The Transcoding technology can also adapt content other than XML for different clients. HTML requires special processing because you can't apply stylesheets to it directly, since it's not well formed. Images also require special handling to adapt them for the destination device. The whole transcoding gateway may be a separate component, installed as an HTTP proxy, or it may be configured as a servlet on the same server that is the content source.
Jabber is XML (Score:1)
http://www.jabber.org/
Re:You're looking at the problem the wrong way (Score:1)
I only partially agree. In the presentation domain, XML can be used to isolate the logical structure of the data from the HTML/WML/etc. It's very useful for this, but beware of the slowness of XSLT (as others have commented). I found that using the fastest XSLT (the jclark version [jclark.com]) it still took around 300 ms to produce about 20K of HTML from XML.
In my situation, much of the XML was static information, so I decided to generate JSP output using XSLT instead, since JSP is compiled; the same could be done with another compiled scripting language. What was most interesting to me was the problem of isolating the static parts of the page, which could be compiled in JSP, from the dynamic parts, which had to come from the database / application layer. In this case, the tag extensions in the latest JSP (1.1) are very handy. They allow the JSP file to be a well-formed XML document, and therefore easily generated by XSLT, and the extended tags can be programmed to interact with the application layer in a very clean way. The tag extensions could be programmed to interact with either an application object or an XML DOM, although actually the latter is more cumbersome.
I agree that XML is not very valuable as a direct interface to the database -- there should always be a layer in front of the database server to enforce access control, implement rules, etc. However, XML is useful as an exchange format between loosely connected servers, such as in B2B interactions. In these cases it is better than using distributed objects, because the coupling is looser and easier to define. But I'm of the opinion that the XML should represent a high-level operation, not database rows.
Re: more *very* useful uses of XML (Score:1)
We have it in production (Score:2)
XML rocks. You don't need to stuff your head full of theoretical debates about namespaces, general entities, etc. All you need is vi (or Notepad) and Saxon. To learn XML syntax, just write XML files by hand and feed them to SAXON until it no longer reports XML errors. To learn XSL, just write XSL files until you get SAXON to actually spit out some HTML. Lots of examples are available to accelerate the trial and error process.
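If you don't have SAXON at hand, the same trial-and-error loop for plain XML syntax can be approximated with a few lines of Python. This is only a well-formedness check built on the standard library, not a substitute for SAXON, and the helper name is made up.

```python
from xml.dom import minidom
from xml.parsers.expat import ErrorString, ExpatError


def first_error(xml_text):
    """Return a description of the first syntax error, or None if well formed."""
    try:
        minidom.parseString(xml_text)
    except ExpatError as err:
        # ExpatError carries the error code and the position of the problem.
        return "line %d, column %d: %s" % (
            err.lineno,
            err.offset,
            ErrorString(err.code),
        )
    return None
```

Keep editing the file until `first_error` returns `None`, then move on to the stylesheet.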
When you are finally ready to integrate the whole shebang into actual applications, there are tons of open-source tools to choose from. Look at the list above again - Apache,PHP,MySQL,SAXON - cost zero - this combo drives one of France's most popular Websites.
Re:Why? (Score:2)
If that doesn't bake yer noodles, download rxp, which also does validation against a DTD.
Really, work on providing XPath and XSL support for expat - the community will thank you _much_ more for it.
Hummingbird & XML (Score:2)
Re:The Road Ahead --- and some pitfalls. (Score:2)