
U.S. House of Representatives Makes Resolutions in XML
RennieScum writes: "The House of Representatives is turning to technology with their test of XML for use with resolutions according to this article. It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job. Testing has begun and their goal is to start using it in January of next year. See also http://xml.house.gov/ And it looks like the DTDs will be free to use and distribute!"
Yee haw! Crappy laws in better format! (Score:2)
Was there any doubt they wouldn't be free? (Score:1)
Re:Was there any doubt they wouldn't be free? (Score:2)
Considering the current government's flirtations with Big Business (not to be confused with Big Brother), I'm actually surprised that they didn't just publish their bills as Word documents.
And looking at the XML documents, it does appear that they're using some non-W3C, Microsoft-like XML stylesheet format. I'd argue that this is favoring one commercial product (Internet Exploder) at the expense of all others.
Re:Was there any doubt they wouldn't be free? (Score:2, Informative)
Um, did you read the source? Or did you just open it up in IE? Because the source is clean (though not prettily formatted:), pure, 100% XML. In fact, there's only one namespace declaration in the entire thing (XLink, which they use to embed hyperlinks between various parts of the documents). All in all, this is some of the cleanest XML I've ever seen (including XML I've written myself by hand:)
But if you opened it up in IE, IE applies a stylesheet to all xml documents which gives you a nice collapsible view of the document tree (which is often easier to read than the source:)
Ugh. DTDs?!? (Score:2, Insightful)
Oh well... at least it's a step forward - I'll applaud them for that.
They DO use schemas... (Score:2)
Re:They DO use schemas... (Score:2)
Re:Ugh. DTDs?!? (Score:1, Insightful)
Why couldn't you just take all of their DTDs and rewrite them as schemas? You could then donate that back to them, and i'm sure they'd be happy to offer it as a download option.
Hell, maybe someone could make an XSL stylesheet to turn DTDs into schemas
-- super ugly ultraman
Schema war is not over...W3C XML-Schema is bloated (Score:3, Insightful)
Have you ever tried to use XML Schema? It's a bloated piece of ****. Relax is tons better. And for the government's purposes, DTDs work much better and are an ISO standard.
Re:Schema war is not over...W3C XML-Schema is bloa (Score:1)
Re:Schema war is not over...W3C XML-Schema is bloa (Score:2)
In this case RELAX [oasis-open.org] is far superior: it has both an XML and a non-XML representation and is built on top of a clean model by some brilliant fellas.
XML Schema, OTOH, is just a bloated mess.
DTD's are antiquated
Perhaps, but they are readable. XML Schema is anything but readable.
and I can't even transform against them for meta-meta-data tasks
Oh, now that's something you do every day. Using XML syntax for everything is just plain stupid. IF you have to do transforms, use RELAX, it has a cleaner model anyway... doing transforms on XML Schema is like pulling teeth.
DTD is sooo 1999. (Score:3, Insightful)
When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.
I guess it's still a step forward.
Re:DTD is sooo 1999. (Score:2)
Jeezus, why would you even consider using Schemas when there is Relax-NG [thaiopensource.com], a much better, simpler, theory-based system? Note the author of that document I gave; it's James Clark; if you are using an XML parser, chances are good it was written by him (expat). Heck, there is not even any normative spec for XML Schema!
DTD may be old (Score:1)
Re:DTD may be old (Score:2)
Seems to me like it's been at 2.0 RC X.x for quite some time.
Re:DTD is sooo 1999. (Score:5, Insightful)
And then in a year or two, you'd just complain how the government can't choose their technologies right.
Start thinking about where you're getting this 'government is stupid/terrible/lazy/blah/blah' message from - a lot of it is from private interests that enjoy the freedom and lack of public accountability to select their technological infrastructure based on higher denominators than your government should. While the 'savvy' factor will always be higher in the private sector, don't *always* take this as an indication that government must be technologically inept (although, like anybody whose core competency isn't technology, they frequently are).
It's like being a private teacher vs public. Private teachers can probably be more 'progressive', but at the cost of maybe teaching in ways that might soon be proven to be ineffectual or bad, while public systems generally must move slower in order to ensure that the ideas have been vetted and that everyone has a moderately equal opportunity to access the fruits of the system.
Like parents, sysadmins, anybody who has an onus to cater to the greater good rather than the richer good, sometimes you have to make decisions that are going to be publicly derided even if it's for the common good. Sometimes you have to just give the benefit of the doubt, though I realize this kind of attitude is in short supply these days.
Ok, rant off.
MOD PARENT UP (Score:1)
Listen up kiddies.
I second that. (Score:1, Redundant)
Not the real issue (Score:1)
Re:Not the real issue (Score:1)
Not to mention in case of any major changes, it doesn't take long to create an XSLT script to convert your XML into anything.
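As a rough illustration of that kind of conversion, here is a minimal sketch. It is written in Python rather than XSLT (the Python standard library has no XSLT engine), and the element names `bill`, `title` and `section` are invented for the example; they are not taken from the House DTDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: element names are invented, not from the House DTDs.
SAMPLE = """<bill number="H.R. 10">
  <title>Example Act</title>
  <section id="s1">First section text.</section>
  <section id="s2">Second section text.</section>
</bill>"""


def bill_to_html(xml_text: str) -> str:
    """Convert the hypothetical bill markup into a small HTML fragment."""
    bill = ET.fromstring(xml_text)
    body = ET.Element("div", {"class": "bill"})
    heading = ET.SubElement(body, "h1")
    heading.text = f"{bill.get('number')}: {bill.findtext('title')}"
    # One <p> per <section>, keeping the section id so links can target it.
    for section in bill.findall("section"):
        para = ET.SubElement(body, "p", {"id": section.get("id")})
        para.text = section.text
    return ET.tostring(body, encoding="unicode", method="html")


if __name__ == "__main__":
    print(bill_to_html(SAMPLE))
```

The same mapping could presumably be written as an XSLT template per element; the point is only that a structured source makes the target format a small, mechanical step.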
Even HTML would be a HUGE improvement (Score:2)
Even the stilted style of language referred to as legalese is partly a product of the need for a meta context within legal writing. This is long overdue, but awesome nonetheless.
Uhhh.... (Score:4, Interesting)
Re:Uhhh.... (Score:2)
Re:Uhhh.... (Score:2)
Re:Uhhh.... (Score:2)
<?xml:stylesheet type="text/xsl" href="member-sorter-vb.xsl"?>
<?xm-well_formed path="m:\xmltech\billres1\00-11-01\Members\mbr107
<ushousemembers xmlns="x-schema:member-schema.xml">
Stylesheet issues... (Score:5, Informative)
Re:Uhhh.... (Score:1)
I think that's because IE uses a default stylesheet for xml documents, while Mozilla strictly complies with the standard and just shows the contents of the tags, without any style.
Re:Uhhh.... (Score:5, Informative)
It's the XSLT (Score:1, Informative)
<?xml:stylesheet type="text/xsl" href="member-sorter-vb.xsl"?>
in the 6th line of the above-referenced xsl document being used to transform the xml:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl" language="VBScript">
basically, they're using the MSXML parser to do their XSLT on the client-side. I've been working with this stuff for a while, and there are a lot of advantages to doing this. The MSXML parser is a lot more mature & well documented than whatever comes built into NS6 & Mozilla (if you know better, please point me to some good resources for working with client-side XSLT on these browsers-- i've looked everywhere).
But it seems to me that public accessibility to these documents should preclude this, and demand that the parsing be done on the server-side.
Beyond that, the fact that they're using VBScript instead of JavaScript for their scripting is indicative of the fact that the people in charge of this initiative are hardcore MS-Heads -- there's no reason for it, you can do some extremely complex stuff with the MSXML parser and JavaScript.
I know this is paranoid, but my past experience has been that even people inside MS use JScript if they can avoid VBScript... unless they're forced to use it for marketing reasons. Wonder who's in charge of this initiative.
Re:It's the XSLT (Score:1)
IIRC, the ASP pages on microsoft.com use JScript; VBScript is great because if you know VB, you can learn VBScript in an hour.
Re:It's the XSLT (Score:2)
Re:Uhhh.... (Score:2, Informative)
See "Unofficial MSXML XSLT FAQ" [netcrucible.com] for some info about the old Working Draft, XSLT 1.0 and Internet Explorer.
I get this in Netscape 7 Preview: (Score:2)
I can't post it because of this error:
Your comment has too few characters per line (currently 6.2)
Check this with IE though: (Score:2)
http://xml.hou
http://xml.house.gov/hr10.xml
all just code
Re:Check this with IE though: (Score:1)
Re:Check this with IE though: (Score:2)
Re:Just use IE6 (Score:2)
Re:Just use IE6 (Score:1)
IE runs on real Unixes, like Solaris and HP-UX. Grow some pubes, take a shower, and get a life.
Re:Just use IE6 (Score:2)
2. IE support on the few unixen where it does run is awful and the thing is too bloated to be practical (since instead of porting IE to unix APIs they ported parts of the Windows API and put IE on top of that, the executable is gigantic on unix.)
3. You did say "IE 6", which even on the few unixes where IE 6 exists, it doesn't go up to that version number, so clearly you are lying.
Re:Just use IE6 (Score:2)
Re:Just use IE6 (Score:1)
How Slashdot-like (Score:5, Funny)
Re:How Slashdot-like (Score:1)
> the government is too behind the times because that don't use new (and better) XML schemas!
Well, this is an administration, you know... So actually they can be credited for having been aware of XML at least a year ago. Had they been aware of XML schemas then it'd have taken another 6 months before the site got up, don't you think?
I'm quite confident that nowadays the average PHB doesn't even know what XML stands for and is used for...
DTDs (Score:2)
But if they really want an intractable problem, they should use XML/Schema!
Oh Boy! (Score:1, Offtopic)
Free DTDs!!! I LOVE DTDs! Wooohoo! We definitely don't have enough of those already!
And who says a Republican government is only out to help the big guys. Free DTDs for all!
Happy 4th everyone! Damn I'm proud to be an American today. Free DTDs!!
-Russ
Re:Oh Boy! (Score:1)
Lawmakers who don't understand the law (Score:4, Interesting)
Pursuant to Title 17 Section 105 of the United States Code, these DTDs are not subject to copyright protection and are in the public domain.
These DTDs can be redistributed and/or modified freely provided that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified.
Sorry, cupcakes, that's not how the public domain works. If you release it into the public domain, you no longer have *any* control whatsoever upon the modification, reuse, or redistribution of the work. The required notice clause listed above is invalid.
Cite [stanford.edu], cite (#3) [templetons.com], cite [ufl.edu].
Kuroth
Re:Lawmakers who don't understand the law (Score:1)
Re:Lawmakers who don't understand the law (Score:1)
Re:Lawmakers who don't understand the law (Score:2)
hmmm makes me think I want to do that with all my documents. Is there a license attribute for meta-data tags in html... if not I'll make one.
I say... (Score:2)
My $0.02
Example of the new markup (Score:5, Funny)
<bill status="proposed" name="CBDTPA">
<sponsor name="Fritz Hollings" constituency="Disney"/>
<violatesAmendment number="1"/>
<violatesAmendment number="4"/>
<contribution donor="Disney" amount="24500.00"/>
<contribution donor="AOL" amount="33000.00"/>
<contribution donor="National Association of Broadcasters" amount="25000.00"/>
<excuse>Promote broadband adoption</excuse>
<excuse>Save the arts from extinction</excuse>
</bill>
Re:Example of the new markup (Score:2)
That's the best part! I always hated that excuse, especially considering how insulting it should be to artists.
Stop and think about this - claiming the arts will die if hollywood dies is like saying the habit of breathing oxygen will die if the SCUBA industry goes belly up.
Indeed, it's not free (Score:3, Informative)
XML is dependent on Unicode, as the US Government site's reference states. Follow the W3C [w3.org] to Unicode:
Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646.
Unicode is owned by Unicode Incorporated [unicode.org] and all of its documents and standards are issued under a restrictive license [unicode.org] with a unilateral change clause:
Modification by Unicode Unicode shall have the right to modify this Agreement at any time by posting it to this site. The user may not assign any part of this Agreement without Unicodes prior written consent.
Dare I compare this evil arrangement to ASCII and other predecessors? To have IBM, M$, Sun and others OWN the very format your data takes and to be able to change it and break previous implementations at whim, when YOU may not? Who wants to bet a plump nickel that anything vaguely resembling Unicode in the future will be called a "derivative" and its distribution halted? Is this not a collusion of commercial software vendors to control information at its most basic representation? Does anyone else here see this as the ultimate extension of copyright? Evil, Evil, Evil.
I'd rather see the US government continue to publish in the American Standard Code for Information Interchange. This extensible standard is no standard at all.
Why didn't they just use standard HTML? (Score:2)
Standard HTML is just as searchable as long as you use the tags properly. One does have to wonder if M$ "encouraged" them to use this format.
Re:Why didn't they just use standard HTML? (Score:2)
Re:Why didn't they just use standard HTML? (Score:2)
What is this mysterious data that can't be expressed in HTML???? Blipverts [techtv.com]!!!??!!?? Maybe they'll put [w3.org] cartoons [w3.org] into the bill--to help explain why they passed it. Oooo...maybe they can put in complex equations [w3.org] so everyone will think they are smart [imdb.com].
I think some people just believe XML is some sort of magical file format that should be used no matter what. I expect MPEG 5 will be in XML, then they'll wonder why the files are so much larger and take 10x the processing time and memory to decode.
XML may be useful in some places, but not everywhere. Replacing binary formats with it is bad because it will unnecessarily increase the file size and the resources needed to decode them. Using it for config files will require all programs to run an XML parser and make the config files less human readable. Using it to express laws will just make them inaccessible to the common person by requiring them to have expensive proprietary software (or software made by an illegal monopoly) to even view them.
If they want bills to be searchable, they should be designing database tables for them, and allow the public to export the database (or subsets of it) in a standard database format. For online viewing, they could easily export the data into HTML (or XML) using PHP.
Using "Microsoft Word and a special converter to do the job" is just stupid. Creating a program that allows some intern to key the data into the database would probably be easier and more effective in the long run.
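For what it's worth, the database approach described above might look roughly like the following sketch. It uses Python and SQLite instead of PHP, and the table layout, column names and sample rows are all assumptions made up for illustration, not anything the House has published.

```python
import sqlite3
from html import escape


def build_db():
    """Create an in-memory database with a hypothetical bill/section layout."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE bill (id INTEGER PRIMARY KEY, number TEXT, title TEXT);
        CREATE TABLE section (
            bill_id INTEGER REFERENCES bill(id),
            position INTEGER,
            body TEXT
        );
    """)
    # Placeholder rows, not real bill text.
    db.execute("INSERT INTO bill VALUES (1, 'H.R. 10', 'Example Act')")
    db.executemany(
        "INSERT INTO section VALUES (1, ?, ?)",
        [(1, "First section text."), (2, "Text with <angle> brackets.")],
    )
    return db


def bill_as_html(db, bill_id):
    """Export one bill as HTML, escaping text so markup in the data is harmless."""
    number, title = db.execute(
        "SELECT number, title FROM bill WHERE id = ?", (bill_id,)
    ).fetchone()
    parts = [f"<h1>{escape(number)}: {escape(title)}</h1>"]
    rows = db.execute(
        "SELECT body FROM section WHERE bill_id = ? ORDER BY position", (bill_id,)
    )
    parts.extend(f"<p>{escape(body)}</p>" for (body,) in rows)
    return "\n".join(parts)


if __name__ == "__main__":
    print(bill_as_html(build_db(), 1))
```

Note that a flat `section` table like this loses the nesting (titles, subsections, paragraphs) that the markup carries for free, which is one argument the XML side of this thread would make.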
Re:Why didn't they just use standard HTML? (Score:2)
Oh yeah, just make up some contrived obviously biased answer! Do you make infomercials???? Or maybe you just don't know anything about html.
The html version of your "example" would probably look more like this:
<p><a name="para1">(1)</a> blah, blah, blah
Re:Indeed, it's not free (Score:1)
Re:Indeed, it's not free (Score:2)
Unicode is owned by Unicode Incorporated [unicode.org] and all of its documents and standards are issued under a restrictive license [unicode.org] with a unilateral change clause:
Have you looked at the copyrights for most standards? Try to get a free copy of the SGML or EDI standards? Unicode is wide open comparatively. Plus, if you're going to complain about vendor-owned consortia, you might as well whine about the W3C itself.
Re:Indeed, it's not free (Score:3, Insightful)
Settle down, they're not trying to use MSXML engines to do the work. Sheesh.
Re:Example of the new markup (Score:2)
Re:Example of the new markup (Score:2)
Re:Example of the new markup (Score:1)
Someone could have a metaserver which puts these additional tags into the official descriptions of the bills.
Each could have links to the sponsoring groups of lobbyists or grassroots, which in turn could be crosslinked to show which bills are obviously just kickbacks, and which are really concerned with issues.
Re:Example of the new markup (Score:3, Interesting)
Just write a http proxy that applies an XSLT to the document. Generate the tag-values from the opensecrets.org database (if they have one).
Could probably be done by one person in a week or two, if opensecrets keep a reasonably usable database, and are willing to cooperate.
If I were an american I would be tempted to write the thing myself...
It would be great to just go to a website and see all bills with a header that indicated which elected officials were involved, and their voting record and ties to special interests.
Hell, if anyone wants to do this, I am willing to contribute just because it's cool...
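The annotation step of such a proxy could be sketched as below. This leaves out the HTTP proxy and the XSLT entirely and only shows the merge, in Python; the `sponsor` and `contribution` element names follow the joke markup earlier in the thread, and the `CONTRIBUTIONS` dictionary is a placeholder standing in for whatever opensecrets.org might actually provide.

```python
import xml.etree.ElementTree as ET

# Placeholder records standing in for an external contributions database.
CONTRIBUTIONS = {
    "Example Sponsor": [
        ("Example Donor A", "1000.00"),
        ("Example Donor B", "250.00"),
    ],
}


def annotate(bill_xml, contributions=CONTRIBUTIONS):
    """Append a <contribution> child to each <sponsor> that has known donors."""
    bill = ET.fromstring(bill_xml)
    for sponsor in bill.iter("sponsor"):
        for donor, amount in contributions.get(sponsor.get("name"), []):
            ET.SubElement(
                sponsor, "contribution", {"donor": donor, "amount": amount}
            )
    return ET.tostring(bill, encoding="unicode")


if __name__ == "__main__":
    source = (
        '<bill name="X">'
        '<sponsor name="Example Sponsor"/>'
        '<sponsor name="Nobody"/>'
        "</bill>"
    )
    print(annotate(source))
```

Sponsors with no entry in the lookup are left untouched, so the output is still the original document plus extra children.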
They are using WordPerfect Too (Score:3, Informative)
The article actually says It shows how each line, name and term has an identifying tag, created by exporting the document from a word processor such as Microsoft Word or Corel WordPerfect into a special XML template.
That would make sense since most of the US government still uses WordPerfect [corel.com]. WordPerfect comes with extensive XML publishing functions including making your own DTDs.
BTW Corel just announced that a new version of Ventura Publisher is coming out in the fall with cross platform XML publishing built in. The next version of WordPerfect is also going to have a much better XML publisher now that they bought XMetaL [corel.com].
don't even validate (Score:2, Interesting)
value of attribute "regeneration" cannot be "yes"; must be one of "yes-regeneration", "no-regeneration"
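To show what that message is complaining about: a DTD can declare an attribute as an enumeration, and a validating parser rejects any other value. The sketch below imitates just that one check in Python. It is not a DTD validator, the allowed values are copied from the error message above, and the element name `toc` in the sample is made up.

```python
import xml.etree.ElementTree as ET

# Allowed values copied from the validator message quoted above.
ALLOWED = {"regeneration": {"yes-regeneration", "no-regeneration"}}

# Hypothetical document: the element name "toc" is invented for the example.
SAMPLE = '<doc><toc regeneration="yes"/><toc regeneration="no-regeneration"/></doc>'


def enum_violations(xml_text, allowed=ALLOWED):
    """Return (tag, attribute, value) for every attribute outside its allowed set."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        for name, value in elem.attrib.items():
            if name in allowed and value not in allowed[name]:
                problems.append((elem.tag, name, value))
    return problems


if __name__ == "__main__":
    for tag, name, value in enum_violations(SAMPLE):
        print(f'<{tag}>: value of attribute "{name}" cannot be "{value}"')
```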
save a buck or two (Score:1)
Re:save a buck or two (Score:1)
DTDs, Schema, and XDR (Score:4, Informative)
Re:DTDs, Schema, and XDR (Score:2)
In what sense is XDR "forwards compatible" with XML Schema? In the sense that you can rewrite all of that Microsoft-proprietary stuff into XML Schema if you care to put in the effort?
Re:DTDs, Schema, and XDR (Score:2)
Re:DTDs, Schema, and XDR (Score:2)
I was speaking with a Microsoft employee on the Schema team today. He reacted in horror to the view that XDR is "upwards compatible" with XML Schema.
Re:DTDs, Schema, and XDR (Score:2)
Great! (Score:5, Funny)
Great, now I can make my own crazy laws! Yipee!
Re:Great! (Score:2)
Actually it's so that lobbyists [riaa.org] can make their own crazy laws [eff.org]. Yipee, indeed.
Re:Great! (Score:1, Funny)
Another Use for Microsoft crap (Score:3, Insightful)
Re:Another Use for Microsoft crap (Score:3, Funny)
Is it still biased in favor of IE users right now? Absolutely, I won't deny that. But if it is actually a properly documented format for once then that bias won't last. This isn't a perfect situation, but it's a major step up from publishing things in proprietary binary word processor formats like they did in the past.
What part about public domain don't they get? (Score:5, Insightful)
Either these DTDs are copyrighted and they can place restrictions upon distribution or they aren't. This need people have to control everything is just driving me crazy. The whole reason for Title 17 Section 105 is so that the Government can't put restrictions on this kind of stuff (bills, laws, etc.)
Finish the job. (Score:1)
Re:What part about public domain don't they get? (Score:2)
Re:What part about public domain don't they get? (Score:2)
And how could I possibly steal something that is in the public domain? Just because they wrote it they own it? The framers of the constitution rejected natural-rights thought with regard to intellectual property. Who owns it anyway? The public of the U.S. paid for it, so don't we own it? If I copy it and use it for my own purposes why would this make me a thief?
I think you have fallen into the group-think that the RIAA wants everyone to succumb to.
Re:What part about public domain don't they get? (Score:1)
Re:What part about public domain don't they get? (Score:1)
So does this mean... (Score:2, Funny)
DTD free to use? huh??? (Score:3, Insightful)
Ummmmm if you're using a validating xml parser, you HAVE to have access to the dtd!!! All DTDs have to be free to use!
Happy 4th! (Score:1, Offtopic)
<?xml version="1.0" encoding="ISO-8859-1"?>
<Flags>
<Flag type="American">
<symbol type="Stars">
<count>50</count>
<background>navy</background>
<color>white</color>
</symbol>
<symbol type="Stripes">
<stripe no="1">
<stripeval>Deleware</stripeval>
<color>red</color>
</stripe>
<stripe no="2">
<stripeval>Pennsylvania</stripeval>
<color>white</color>
</stripe>
<stripe no="3">
<stripeval>New Jersey</stripeval>
<color>red</color>
</stripe>
<stripe no="4">
<stripeval>Georgia</stripeval>
<color>white</color>
</stripe>
<stripe no="5">
<stripeval>Connecticut</stripeval>
<color>red</color>
</stripe>
<stripe no="6">
<stripeval>Massachusetts</stripeval>
<color>white</color>
</stripe>
<stripe no="7">
<stripeval>Maryland</stripeval>
<color>red</color>
</stripe>
<stripe no="8">
<stripeval>South Carolina</stripeval>
<color>white</color>
</stripe>
<stripe no="9">
<stripeval>New Hampshire</stripeval>
<color>red</color>
</stripe>
<stripe no="10">
<stripeval>Virginia</stripeval>
<color>white</color>
</stripe>
<stripe no="11">
<stripeval>New York</stripeval>
<color>red</color>
</stripe>
<stripe no="12">
<stripeval>North Carolina</stripeval>
<color>white</color>
</stripe>
<stripe no="13">
<stripeval>Rhode Island</stripeval>
<color>red</color>
</stripe>
</symbol>
</Flag>
</Flags>
Note: I'm from New Mexico, so I know what it feels like when a state gets left out. Rest assured, my flag includes Deleware!
Re:Happy 4th! (Score:1)
by misspelling *Delaware*!
HR has made 100 DTDs (Score:5, Funny)
yep (Score:1)
No, this doesn't mean you can make your own laws. =P
we need open source software (Score:1)
U.S. Senate Responds... (Score:1)
XML creaps in another place (Score:2)
The problem with XML is that it diverges into two distinct worst cases. One requires an infinite amount of memory, the other an infinite amount of time. Both of these are bad things and much study of algorithms is about avoiding both of these conditions. Odd thing is most people in the IT field today have no clue about why this happens or even that it can happen. Of course these are the same programmers that couldn't describe a quicksort if they had to or describe something in BNF grammar. And we wonder why most programmers today just produce garbage.
Re:XML creaps in another place (Score:2)
Re:XML creaps in another place (Score:2)
Whether it will allow you to try to recover or not at that stage would be up to the parser.
Recovering from malformed input is regardless a difficult task, and typically you don't want to go there - that's not a parsing issue, but an issue of trying to predict how an error should be recovered.
For a DOM parser, the parser would do the same thing, and just fail and free the tree once it found the surrounding tag (or the end of the file). However using a DOM parser with a scenario like the one you suggested would be plain stupid.
In either case, handling a missing closing tag is trivial with XML, and I certainly can't see any justification for the claim that you'd either need unlimited memory or unlimited time based on that
Anyway, you've just given an example of a case where ANY grammar based on nested blocks will have to have thought put into it when it is fed bad data, with no justification for why it should make XML bad from a parsing standpoint.
Do you have a better example?
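To make the streaming case concrete, here is a minimal sketch in Python using the expat bindings. It feeds the input in chunks, keeps only a nesting counter as state, and reports a missing closing tag as an ordinary error at end of input. The `max_depth` cap and the `check_stream` name are my own additions for illustration, not something from the posts above.

```python
import xml.parsers.expat


class DepthLimitExceeded(Exception):
    """Raised by the sketch when nesting goes past the configured cap."""


def check_stream(chunks, max_depth=1000):
    """Feed chunks to a streaming parser; return None on success, else an error string."""
    depth = 0

    def start(name, attrs):
        nonlocal depth
        depth += 1
        if depth > max_depth:
            raise DepthLimitExceeded(f"nesting deeper than {max_depth}")

    def end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        for chunk in chunks:
            parser.Parse(chunk, False)
        parser.Parse("", True)  # signal end of input; unclosed tags fail here
    except xml.parsers.expat.ExpatError as exc:
        return f"not well-formed: {exc}"
    except DepthLimitExceeded as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(check_stream(["<a><b>", "</b></a>"]))
    print(check_stream(["<a><b>text"]))
    print(check_stream(["<a>" * 5], max_depth=3))
```

Memory use here is bounded by the chunk size and the depth cap rather than by the document size, which is the property the streaming argument relies on.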
Re:XML creaps in another place (Score:2)
So what you are really saying is that your problem is with ANY system that allow scoping, and where state is required for each scope until the scope is closed?
The problem with that is that scoping is useful and makes it a lot easier to represent a whole lot of data in a structured form that seems natural to humans.
In other words, an XML parser may require more resources than a parser for a grammar without scoping. But the scoping is allowed for a reason - it provides structure that is hard to provide without it.
The reason you can't make a file that breaks grep is that grep doesn't care about structure. You can easily work on XML files without running into the problem as well if you ignore structure. But then you are also losing a whole lot of advantages.
I still don't see this as a problem. You need to handle resource limits regardless. If you have 1MB available, as you originally used in your example, then when you have used that 1MB then you have to fail gracefully. If the only case where you use the whole 1MB is a broken document, then whether you fail because the parser detects it or fail because you don't have more memory is irrelevant - the parse failed.
If you need to give more specific error messages, you can do that fairly easily, by, when you've filled memory, scanning the remainder of the document to determine whether any of the outer tags will EVER get closed.
If you want to recover from unclosed tags, the standard way of doing that for HTML and XML is to define which start tags you want to autoclose which types of open tags for.
This is a straightforward mechanism that works well, in particular in the presence of a schema or DTD where you can easily determine where leaving a tag open means the document is malformed where it may possibly be wellformed if the tag is closed.
I haven't implemented it for XML, but I have implemented it in an HTML filter that needed to handle particularly broken HTML.
In the real world this is a problem only if you don't think about it and design your software to handle it, just as not thinking your design through in general leads to broken software.
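The autoclose idea above can be sketched in a few lines of Python on top of the standard `html.parser` module. This is only a guess at the general shape, not the filter described in the post: the `AUTOCLOSE` rule table is hypothetical, attributes are dropped, and void elements such as `<br>` are not special-cased.

```python
from html.parser import HTMLParser

# Hypothetical rule table: a start tag (key) closes any open tag in its set.
AUTOCLOSE = {"p": {"p"}, "li": {"li"}, "tr": {"tr", "td"}, "td": {"td"}}


class AutoCloser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []  # currently open tags, innermost last
        self.out = []    # pieces of the repaired output

    def handle_starttag(self, tag, attrs):
        closes = AUTOCLOSE.get(tag, set())
        # Close open tags that this start tag is declared to terminate.
        while self.stack and self.stack[-1] in closes:
            self.out.append(f"</{self.stack.pop()}>")
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes dropped to keep the sketch short

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Close everything opened since the matching start tag.
            while self.stack:
                top = self.stack.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break
        # An end tag with no matching open tag is ignored.

    def handle_data(self, data):
        self.out.append(data)  # text is passed through unescaped in this sketch

    def repaired(self):
        self.close()  # flush any buffered text
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def repair(fragment):
    parser = AutoCloser()
    parser.feed(fragment)
    return parser.repaired()


if __name__ == "__main__":
    print(repair("<ul><li>one<li>two</ul>"))
```

With a schema or DTD at hand, the rule table could in principle be derived from the content models instead of being written by hand.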
Re:XML creaps in another place (Score:2)
And I'm used to dealing with users on the input side. The company I work for operates the .name TLD. Registrars interact with us via XML. Our subcontractors interact with us via XML. We're dealing with far from perfect XML and errors need to be communicated.
We did use to have an ASCII based format, and we had more problems with that. The advantage of XML is that the users can validate the XML generated pretty well on their side by running it through an XML parser with schema validation support.
Re:And now ged rid of the legacy (Score:1)
The Importance of DTDs (Score:2)
a commitment to open data formats. Even where we don't get open source code, this guarantees that we don't get the most virulent form of 'vendor lock-in', where failure to pay the latest rent increase means we can't even access our own data [slashdot.org] anymore.
---
Fight Page Widening! Make your own line <br>:reaks.