The Internet

Netscape Dumps Critical File, Breaks RSS 0.9 Feeds

An anonymous reader writes "In the standard definition of RSS 0.91, there are a couple of lines referring to 'DOCTYPE' and referencing a 'dtd' spec hosted on Netscape's website. According to an article on DeviceForge.com quite a few RSS feeds around the web probably stopped working properly over the past few weeks because Netscape recently stopped hosting the critical rss-0.91.dtd file. Probably someone over at netscape.com simply thought he was cleaning up some insignificant cruft." Some explanation has been offered by a Netscape employee.
  • Ack (Score:4, Funny)

    by Cygfrydd ( 957180 ) <cygfrydd DOT llewellyn AT gmail DOT com> on Sunday January 14, 2007 @10:32AM (#17602578)
    I would've seen this post sooner, but my RSS feed was broken... something about a 404?
  • by eurleif ( 613257 ) on Sunday January 14, 2007 @10:32AM (#17602586)
    I don't see how this would break RSS readers. DTDs pretty much never get read except by validators. Normal SGML and XML parsers just treat the DTD URL as an opaque string, not as something that can be retrieved.
    • Actually the DTD is loaded up by pretty much every proper XML library even if validation is "off".

      The DTD contains more than just the element definitions and hierarchy. It's also used to define entities (&...;) that are non-standard to XML but may be expected in the file. HTML has tons of pre-defined entities but XML only has the core 4. All others are defined in DTDs and loaded on the fly as part of the processing.

      There are ways to turn it off at the lowest levels, but higher-level abstractions/libraries might not give access to that. For example, with JAXP + SAX you can turn off DTD loading, but Jakarta Commons Digester doesn't give a setting where you can trigger that, so Digester tries to load the dtd, and even with validation off you can't change that. My only recourse is to take the DTD lines out of the various config files. (Reason: My JBoss server is deployed in private networks where the server can't reach the internet).
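
      To make the "turn it off at the lowest levels" option concrete, here is a rough sketch of doing it through JAXP + SAX. It assumes a Xerces-based parser (the load-external-dtd feature is Xerces-specific), so treat the feature name as an assumption about your particular XML library:

      import javax.xml.parsers.SAXParser;
      import javax.xml.parsers.SAXParserFactory;
      import org.xml.sax.XMLReader;

      public class NoExternalDtdParser {
          public static XMLReader newReader() throws Exception {
              SAXParserFactory factory = SAXParserFactory.newInstance();
              // Don't validate against the DTD at all.
              factory.setValidating(false);
              // Xerces-specific feature: don't even fetch the external DTD.
              factory.setFeature(
                  "http://apache.org/xml/features/nonvalidating/load-external-dtd",
                  false);
              SAXParser parser = factory.newSAXParser();
              return parser.getXMLReader();
          }
      }

      Anything parsed through the reader returned here never makes the trip to my.netscape.com; the catch, as noted, is that libraries like Digester build their parser internally and don't expose these switches.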
      • Re: (Score:1, Redundant)

        by tomhudson ( 43916 )

        My only recourse is to take the DTD lines out of the various config files. (Reason: My JBoss server is deployed in private networks where the server can't reach the internet).

        Why not change your hosts file so that my.netscape.com/publish/formats/rss-0.91.dtd points to a copy on your_local_network/publish/formats/rss-0.91.dtd? (For example: 192.168.1.96 my.netscape.com.)

        • by acroyear ( 5882 )
          Which "hosts file"?

          We have licensed customers with this (in fact, they're the problem - internally we don't see any problem with having outbound internet access for our servers and trust the firewalls we put up); I can't go telling their IT departments (who are sometimes completely inept) that we need to go mucking around with OS configuration files, especially in Windows.
        • by enosys ( 705759 )
          Using the hosts file to redirect accesses to the DTD would block access to my.netscape.com. Maybe they deserve it for breaking RSS and still not having fixed it, but I'm sure some users would complain.
          • True, but if all they're doing is grabbing a DTD from there, who will notice? It's offline right now anyway...

            My.Netscape is undergoing changes!

            We are currently working to launch an updated My.Netscape complete with all the functionality you've enjoyed in the past built upon an entirely new technical foundation. This will come with additional functionality and a much overdue face-lift. Unfortunately, such an endeavor requires us to take the current My.Netscape off line for the time being. We're sorry

      • Why don't these tools fetch the file only once and cache it? They are not supposed to change, are they? Even if the caching is just for performance reasons.
        • by acroyear ( 5882 )
          Caching it requires having access to get it in the first place, doesn't it? I can't ask them to open their system up to the 'net just to grab some file every time they restart their servers, can I?

          See my comment elsewhere in this thread - there's some "magic" that Java can do to load a local file as a resource given an external URL but I've never seen a decent example of how to set that up for my own apps.
          • Re: (Score:2, Informative)

            by Mithrandir ( 3459 )
            What you need to implement is org.xml.sax.EntityResolver. There are several methods that need to be implemented, covering the different ways the SAX parser can query for stuff. Basically it will give you the Public ID and/or System ID and ask you to return a stream to what that resolves to. Then, in your code, all you do is keep a hashmap that maps a given ID to a local resource (e.g. a file or database BLOB) and do your own stream opening/processing from there. I attempted to post some example code but see
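
            A bare-bones sketch of that approach, assuming a SAX-based parser; the classpath location /dtds/rss-0.91.dtd is just a hypothetical spot where you've bundled your own copy of the DTD, and the public identifier is the one RSS 0.91 feeds declare, as far as I recall:

            import java.io.InputStream;
            import java.util.HashMap;
            import java.util.Map;
            import org.xml.sax.EntityResolver;
            import org.xml.sax.InputSource;

            public class LocalDtdResolver implements EntityResolver {
                // Maps well-known public IDs to copies shipped inside your own jar.
                private final Map<String, String> localCopies = new HashMap<String, String>();

                public LocalDtdResolver() {
                    localCopies.put("-//Netscape Communications//DTD RSS 0.91//EN",
                                    "/dtds/rss-0.91.dtd");
                }

                public InputSource resolveEntity(String publicId, String systemId) {
                    String resource = localCopies.get(publicId);
                    if (resource == null) {
                        return null; // unknown ID: let the parser fall back to the URL
                    }
                    InputStream in = getClass().getResourceAsStream(resource);
                    return (in == null) ? null : new InputSource(in);
                }
            }

            Register it on the reader before parsing (xmlReader.setEntityResolver(new LocalDtdResolver())) and the parser never needs to touch my.netscape.com.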
      • Re: (Score:1, Insightful)

        by Anonymous Coward
        You're saying that it's impossible to correctly parse XML unless you can get a port 80 connection to a random internet host while doing it?

        Wow, XML is even more fucked up than I thought.
        • by acroyear ( 5882 )
          In general, you're right. The various standard languages built on XML do basically require and expect that you're able to look at files on the internet. XML wasn't designed for many of the private-network uses it's being used for today, and neither were many of the tools that reference public DTDs.

          However, with most parsers there is a way to set up a mapping where it can look on the local box for a file matching a specific URL. JBoss, Hibernate, and Struts all do that for DTD files they have in their resp
          • Re: (Score:3, Insightful)

            by KrisWithAK ( 32865 )
            You are right. I wish I had seen this article earlier so that I could have posted sooner -- and so others could have seen the "solution"!

            Ever since I started developing on a laptop during my commute, I discovered that XML-based programs like J2EE servers would simply stop working. I experienced the same thing at work where, by default, your desktop applications (namely Eclipse) do not have access to the internet, and the servers will never have access to the "Internet".

            The proper thing to do is for your
        • by nuzak ( 959558 )
          Wow, XML is even more fucked up than I thought.

          Yeah, especially when you don't think at all. It has to do with validation, nothing to do with parsing.

          As for validation, should every single possible doctype anyone might think of be hardwired into the XML spec? Here's a free clue: DTDs and XSDs don't have to use http urls, and DTDs even have something called a "system catalog".

      • by Ozan ( 176854 )
        I always thought about having abstraction and redundancy introduced in a way that critical resources like the DTD get a real URN like urn:netscape.com:rss:0.91, and then parsers fetch the resource by looking up the URN in a special service that translates it to one or multiple URLs. With multiple ones, if the first one doesn't work, another one can be tried.

        Basically this system would do for URNs and URLs what DNS does for domain names and IP numbers.
        • by acroyear ( 5882 )
          well, as long as they're talking about an XML "2.0", maybe you should write to a committee member and submit the idea?
      • HTML has tons of pre-defined entities but XML only has the core 4.


        Small correction: it actually has 5 (&lt; &gt; &amp; &quot; &apos;, i.e. < > & " '). The last one (&apos;) wasn't in HTML 4.
  • by Anonymous Coward
    what is the probably web?
  • Ugh (Score:4, Insightful)

    by SuperBanana ( 662181 ) on Sunday January 14, 2007 @10:35AM (#17602616)

    According to an article on DeviceForge.com quite a few RSS feeds around the probably web stopped working properly over the past few weeks because Netscape recently stopped hosting the critical rss-0.91.dtd file.

    STOP, Grammar time. Ooooh whoooaaa oh oh...

    Probably someone over at netscape.com simply thought he was cleaning up some insignificant cruft."

    Or Netscape got tired of people using their bandwidth. Regardless of the reasons: if you reference a file on someone's site, it's hardly their fault if they move/change/delete it, and it breaks your stuff.

    • Re: (Score:2, Interesting)

      by fatphil ( 181876 )
      But DTDs were designed to be precisely that. Likewise class paths in Java.
      Unnecessary hard-coding of something that isn't necessarily permanent.
      I never liked the idea, I'm glad to see that some of my worries are well-founded.
      • The point... (Score:4, Insightful)

        by Junta ( 36770 ) on Sunday January 14, 2007 @10:51AM (#17602756)
        Is to have a common component shared among many documents without replication.

        Class paths in Java are the perfect example of how it *should* work. Java CLASSPATHs in every application/installation I have seen are site-local, with all paths accessible without going over the internet to another site to get classes.

        To be similar, an RSS site should copy this DTD to their local server, or to a server with which they have a concrete understanding of the relationship. Either a commercial agreement with a peer or at least using a server from an organization who explicitly defines the purpose of hosting to be a common place to promote it as a standard.

        Did netscape promise itself to be an organization sharing that DTD explicitly, or did site developers get in the practice because 'it just always was there'?
    • DTDs are different (Score:5, Interesting)

      by pikine ( 771084 ) on Sunday January 14, 2007 @11:02AM (#17602866) Journal
      A DTD (document type definition) is a file that describes how an SGML document is structured. In this case, the DTD that went missing defines RSS 0.91, which is used by Navigator 4 for "channel" subscription.

      It is expected that DTDs are hotlinked. For example, if you ever look at the HTML source of a web page, you will see:

      <!DOCTYPE ...>
      at the top, and the hotlink goes to somewhere on w3.org. That is because the W3C is the authority body that defines HTML.
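
      For RSS 0.91 itself, the declaration in a feed typically looks something like the following; the system URL is the my.netscape.com address discussed elsewhere in this thread, and the exact public identifier is quoted from memory, so treat it as approximate:

      <!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
          "http://my.netscape.com/publish/formats/rss-0.91.dtd">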

      Since Netscape is the authority body that defines RSS 0.91, it is a bit strange how they stopped hosting the definition.

      In any case, the missing definition won't affect software that merely processes RSS feeds. It only affects software that checks whether an SGML document is structured properly according to that missing DTD.

      The main interest of this article seems to be the speculation about how a deprecated web 1.0 company could end up hiring a clueless webmaster who deletes important files without recognizing their importance.
      • Since Netscape is the authority body that defines RSS 0.91, it is a bit strange how they stopped hosting the definition.

        It's not strange; it's just their way of telling everyone that RSS 0.91 is dead and people should stop using it, apparently.

    • What? You've never heard of a probability web before? ...Holds the improbability drive at NS in place?
    • Re: (Score:3, Insightful)

      Comment removed based on user account deletion
  • by Programmer_Errant ( 1004370 ) on Sunday January 14, 2007 @10:38AM (#17602642)
    "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
  • SPOF? (Score:2, Insightful)

    by basketcase ( 114777 )
    Or maybe some smart person at Netscape decided to teach some people a lesson about using a 3rd party as a single point of failure?
  • by gravesb ( 967413 ) on Sunday January 14, 2007 @10:46AM (#17602724) Homepage
    And if so, why would anyone rely on AOL to make something on the web work?
  • HAHA (Score:5, Funny)

    by eMbry00s ( 952989 ) on Sunday January 14, 2007 @10:47AM (#17602732)
    Suck that, Web 2.0!
  • by nascarguy27 ( 984493 ) <nascarguy27&gmail,com> on Sunday January 14, 2007 @10:58AM (#17602826)
    This is the precise reason why I host everything myself, including my own series of tubes, dubbed the Internets. I host not only every file that my site uses, but I also have a program that regularly crawls the entire Internet and compresses it onto my own distributed system. That way I can browse the Internet by myself without worrying if someone else's system will fail. Although I do need to replace systems every now and then. But that's not a problem, b/c the distributed system has 3-5 copies of the Internet, each copy in a different place. Wait, isn't there some other company that does that? I can't quite place the name.

    Seriously though, relying on some other system so your site will work is a recipe for disaster. It's similar to relying on someone to take you to work every day. After a while, you get so used to the fact that someone else is driving you that you don't even think about it. Then your driver gets deleted somehow. And you're stuck with no way to get to work.
    • Then your driver gets deleted somehow. And you're stuck with no way to work.

      It's even worse if your driver gets somehow deleted while you're on the road. Happens to my XP box all the time.
  • by Anonymous Coward
    Whenever someone accesses an RSS file, Netscape would know the IP for every access? How stupid is that? Why don't the readers just cache the DTDs and fetch only if there's a problem?
    • by XMyth ( 266414 )
      Because of abstraction (the lower level XML library is what would retrieve the DTD, but it'd only receive the raw XML data to be parsed/validated) the request for the DTD almost certainly doesn't contain the referrer. Even if it COULD contain the referrer, there's no real reason why it should IMHO.
    • Exactly. The problem is that you're supposed to have an XML catalog that lists the location of the DTDs on the local filesystem (at least, that's how 4suite works). If you don't set that up, a validating parser can either fail or fetch the DTD from the URL specified in the document (personally, I prefer that it just fail so that I can fix the problem, but it's not the default).
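
      For the curious, a minimal sketch of such a catalog in the OASIS XML Catalogs format (the local file path is hypothetical; point it at wherever you keep your own copy of the DTD):

      <?xml version="1.0"?>
      <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
        <system systemId="http://my.netscape.com/publish/formats/rss-0.91.dtd"
                uri="file:///usr/local/share/xml/rss-0.91.dtd"/>
      </catalog>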

    • Yes, this is a known issue with XML/SGML. In fact, external DTDs can do more than just log the usage; they can also be the cause of various security issues. See:

      http://www-128.ibm.com/developerworks/xml/library/x-tipcfsx.html [ibm.com]
  • by Morky ( 577776 ) on Sunday January 14, 2007 @11:11AM (#17602922)
    This could seriously affect both of the guys using Netscape.
  • How is this any different than the W3C doctypes, the ones that point at w3.org? Most web tutorials tell you to use the remote file... but if it ever goes down, what happens? Is there a reason for it being taught this way?
    • Re: (Score:2, Insightful)

      by DavidTC ( 10147 )

      It's not any different, except the w3c is run by intelligent people, and Netscape, apparently, is not.

      I've always thought the full paths were a bit stupid too, and they should have some sort of shortcut standard, one that says "Use w3c's HTML 4.0 standard", and the web browser knows how to construct a path to find w3c standards. That way, when "Use netscape's RSS 0.91" standard stopped working, web browsers could have a trivial update, or their config could even be changed manually, to tell them where to find

      • Granted, they already have something like this in the DOCTYPE; that's what '-//W3C//DTD HTML 4.01//EN' is, but then they blow it by including the path after that. The parser should, instead, look at W3C and go 'Hey, I know where that is, that's w3c.org' and construct a standardized path using 'DTD HTML 4.01', like 'http://w3c.org/doctype/HTML4.01.dtd'. (And I just realized that string mysteriously doesn't include 'strict' or whatever in it, so now I'm slightly confused as to what good it's for.
  • by gastropod ( 105661 ) on Sunday January 14, 2007 @11:36AM (#17603114)
    From April 2001, "Netscape removed the RSS 0.91 DTD from their website. This means that all RSS feeds which depend on the RSS 0.91 (many, MANY news sites) cannot be used with a validating parser." [slashdot.org]

    It seems as though it just took them 5+ years to follow up on the threat? Primary links are broken, but of course the lively /. discussion (which, um, I haven't read) remains.
  • Sorry about that (Score:5, Informative)

    by christopherfinke ( 608750 ) <chris@efinke.com> on Sunday January 14, 2007 @11:42AM (#17603162) Homepage Journal
    my.netscape.com is undergoing a redesign, and when we announced the redesign [netscape.com] about 10 days ago, the DNS entry for my.netscape.com was changed to point to the new server where My Netscape will be living. This had the effect of making anything under the old my.netscape.com unavailable, since the only thing public on the new server is a splash page. (Nobody on the team was especially aware of this DTD file since all of the old Netscape employees were let go last year around the time Netscape.com was redeveloped; anybody working at Netscape now was hired since then.)

    Now, why this file was living under my.netscape.com is anybody's guess, but we'll have it restored ASAP. I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.

    Christopher Finke
    Netscape Developer
    • Re:Sorry about that (Score:5, Informative)

      by mmurphy000 ( 556983 ) on Sunday January 14, 2007 @12:04PM (#17603338)

      What's the official way to let you know about this sort of thing? I'm not trolling -- the better you can inform folk like us about how to interact with you, the more likely it is you'll get a response when you need it. For example, a quick scan of the Help and FAQ pages linked to off of the Netscape home page shows no mention of how to contact folk like you.

      • Re:Sorry about that (Score:5, Informative)

        by christopherfinke ( 608750 ) <chris@efinke.com> on Sunday January 14, 2007 @12:16PM (#17603440) Homepage Journal
        What's the official way to let you know about this sort of thing?
        You're correct that contact information appears to be MIA in the Netscape Help pages; I'll make sure to remedy that ASAP.

        For something as serious as this, a user could have checked the profile of one of the Netscape Anchors or developers, where many of them list their screennames or websites, and subsequently, their e-mail addresses. (At least, I know I do [netscape.com].) Alternatively, any Netscape.com member could use Netscape sitemail to contact any of the staff members. Obviously, these are unacceptable for normal circumstances, but I wouldn't call this situation a normal circumstance.
    • by Sleepy ( 4551 ) on Sunday January 14, 2007 @12:18PM (#17603468) Homepage
      This blast is not squarely aimed at you, but you triggered it. Treat this in the spirit it is meant please (if I didn't give a crap at all, I wouldn't comment. Show this to your insulated bosses who don't know the first thing about community and transparency. Kudos to you BTW for showing initiative and acting on a Slashdot post. Honestly, I would not have given the "new Netscape" that much credit.).

      >I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.

      This rankles.

      Have you EVER tried contacting Netscape from the outside world? Seriously, I can count the number of times:

      *) When my.netscape.com locked out Konqueror (1998?)
      *) When my.netscape.com WITHDREW the ability to embed RSS feeds on your "my" page -- actually this was PRE-RSS if I recall. Way before it was commonplace, you could embed Slashdot and Linux Today feeds. Then they killed it, presumably because they got enough users, or for some pointy-haired reason. 1999.
      *) When my.netscape.com adopted a shitty policy of DELETING all your mail if you don't login for 30 days. This did not seem to be publicised by an actual email. They don't seem to delete the mailbox itself, which violates RFCs I'm sure and basically insinuates the mailbox is active. I lost tons of mail from 1996-2003 (yeah yeah backups. Some things I didn't think I would need later). ?? Happened in 2003. Note that mailboxes were only 5MB still, so I quickly bailed for a 100MB Yahoo account.
      *) The 2001 deletion of Netscape Developer. This lost a ton of Netscape copyrighted Javascript documentation.

      Just TRY contacting Netscape from their page. The best you can do is use the WRONG FORM to submit to some contractor who won't forward it. Or, oh yeah - there's a 900 number for by-the-minute support.

      Back when it mattered, there was no 'Google Guy' for Netscape who would act as an unofficial liaison. After Jamie Z left, no one internally tried to fill the shoes of a community-facing employee.

      While I'll be eternally grateful for Netscape's open-sourcing of their browser, what a different world it is now. Too bad that step is something the current management would never have allowed (that's the perception). I can't think of a more opaque Internet company than today's Netscape. I'm sure there are people who disagree or wish it could be changed (you're here..) but that and $1 gets you a cup of black coffee. Show this to your boss - there are suggestions here :-)
      • by christopherfinke ( 608750 ) <chris@efinke.com> on Sunday January 14, 2007 @12:25PM (#17603532) Homepage Journal
        You make several good points that I want to respond to more fully, but I've got to run out, so I'll have to do that later. In the meantime, I'll put this out there: my e-mail address is chris@newnetscape.com [mailto]; my screenname and other contact information is available at my website [efinke.com]. Anyone who wishes to do so can contact me regarding issues with any of the Netscape websites or the Netscape browser; if I can't solve your problem, I can definitely get you connected with the right person.
        • Again, I appreciate your openness and availability, but if I have a problem with a Netscape website, the first place I'm going to look is not going to be the website of an individual developer, hosted over at efinke.com. I'm going to go to the Netscape Help page, and if I can't find a simple way to report a problem there I'm going to leave.

          Your entire company needs to make a commitment to open dialogue, not just you individually.
      • Re: (Score:2, Informative)

        by Alphab.fr ( 897672 )
        *) The 2001 deletion of Netscape Developer. This lost a ton of Netscape copyrighted Javascript documentation.

        Unless I'm mistaken, this has been (quite some time afterwards) transferred to the Mozilla Foundation, and can be accessed at http://developer.mozilla.org/en/docs/JavaScript [mozilla.org]. Cheers,
      • Re: (Score:3, Interesting)

        I can't speak for the pre-2006 Netscape, but as far as the current Netscape organization is concerned, I have to disagree with you regarding transparency.

        * The former GM, Jason Calacanis [calacanis.com], blogged extensively about Netscape and encouraged other employees to do the same. He also called out other industry big-wigs [calacanis.com] for "not having time" to blog about their product.
        * Many [fabienne.us] Netscape [neothoughts.com] Anchors [robotskirts.com], Navigators [themulife.com], and developers [efinke.com] have an active blog [sampletheweb.com] where they write about Netscape and/or are available to discuss it.
        * There i
    • by martyb ( 196687 ) on Sunday January 14, 2007 @12:18PM (#17603472)
      (Nobody on the team was especially aware of this DTD file since all of the old Netscape employees were let go last year around the time Netscape.com was redeveloped; anybody working at Netscape now was hired since then.)

      Now, why this file was living under my.netscape.com is anybody's guess, but we'll have it restored ASAP. I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.

      Ummm, maybe I'm missing something here, but I would think that your web log would show a spike in 404 errors for this file, right? In my experience, it is helpful to assume that I do not know what I don't know, and to put procedures in place to help make those omissions stick out. So, a scan of your log files not only for this file, but for any others that also have a high number of 404s (especially from a multitude of referrers), would be worth investigating.

      BTW, best of luck on the redesign!

      • Re: (Score:3, Informative)

        by VGPowerlord ( 621254 )
        They removed every file, causing a spike in 404s for all of them.
        • by martyb ( 196687 )

          Doh! <wipes egg from face>. Point made. I would still love to get my hands on their logs to see what other important pages may be failing.

    • by Anonymous Coward
      It is not the first time that Netscape has dropped important files, and Netscape didn't care in the past either. E.g. the RDF schemata http://home.netscape.com/WEB-rdf [netscape.com] and http://home.netscape.com/NC-rdf [netscape.com] bit the dust some time ago and Netscape didn't give a fucking fart. Other format specs have also been gone for years, but I am too lazy to look up what went missing years ago.

      But hey, from where I sit, Netscape is run by a bunch of liberal arts graduates anyhow and doesn't have any technical competence left.
    • Re: (Score:2, Interesting)

      by gjuk ( 940514 )
      1) Isn't it great that the guy comes on and is open and helpful? There are plenty of organisations that could learn from this.

      2) I found it amusing that the /. summary states "quite a few RSS feeds around the web probably stopped working properly" - what, so perhaps none stopped working? It would be great to see a list of ones which actually did... anyone?
      • The feeds, I think, are working fine: they just spit out the DOCTYPE line normally. The problem occurs when feed readers, which use the DOCTYPE line to get the DTD to make sure the document is formatted just like the DTD says it is, try to download the DTD from Netscape's website. Since they get a 404, they assume something is wrong with the feed, and tell you as much.

        According to the article, Google and Firefox's feed readers don't mind that the DTD's gone missing, but Microsoft Live's RSS Feed Gadget does
    • Re:Sorry about that (Score:4, Interesting)

      by dubl-u ( 51156 ) * <<ot.atop> <ta> <2107893252>> on Sunday January 14, 2007 @01:06PM (#17603978)
      Hi, Christopher. First off, full marks for stepping up and explaining things honestly. You have done more good for Netscape than a dozen PR people. I'm sure you'll take a lot of crap from my fellow Slashdotters, but don't let it throw you. Listen to and acknowledge their legitimate complaints and you'll do fine.

      I only wish that someone had brought it to our attention so that I didn't have to find out about it from Slashdot.

      If you are looking to learn a lesson from this, how about this one: URLs are forever!

      Whenever I make a change to a live server, my biggest concern is to not break existing usage. If I ever change an URL, I make sure to redirect old usage to new usage that's just as good. And if I'm ever not sure something is used, I generally look back at least three months in the logs. Especially if you've inherited a pile full of mystery, good analytical tools for your server logs are vital. Trying to run even a modestly-sized site without them is like running a large store without tracking your inventory: your life will become a series of unfortunate surprises.
      • Re:Sorry about that (Score:5, Informative)

        by christopherfinke ( 608750 ) <chris@efinke.com> on Sunday January 14, 2007 @01:29PM (#17604190) Homepage Journal
        URLs are forever!
        Indeed, words to live by. I wouldn't pin this mistake on one person not checking the right logfile though; in a company as large as AOL, when an entire 150-person workforce is laid off and a new (much smaller) team is brought in to manage the old properties, things sometimes get lost in the shuffle. The entire my.netscape.com service happened to be one of those things. I'm sure that this incident will act as a reminder to never let this type of thing happen again.

        And BTW, it appears that the DTD file will be restored early tomorrow morning at the latest.
        • an entire 150-person workforce is laid off and a new (much smaller) team is brought in to manage the old properties, things sometimes get lost in the shuffle.
          I wonder if your beancounters would realize that a web property's value rapidly approaches zero when it is not maintained. Probably not. I guess they think netscape is a hole in the internet that you throw money into.
    • by v1 ( 525388 )
      So what was the inspiration to sack the entire Netscape team?

      And this is not the first time it has happened. Every time I install Netscape on an old iMac, it stalls out for like 90 seconds on initial launch because something else the browser depends on has been deleted from your servers. A few times I had a very hard time installing Netscape because it refused to quit digging for that page on launch. I never did figure out exactly what it was looking for. (something to do with first-run registration I
  • Huh? (Score:2, Redundant)

    Netscape? What's a Netscape?
  • by owlstead ( 636356 ) on Sunday January 14, 2007 @11:52AM (#17603240)
    If I were to create a reader that was dependent on version 0.91 of the distribution, it sure as hell would include the DTD in local storage. It makes no sense to create a reader that can also use, say, version 0.92, since you would not know what had changed (and there is no such thing as inheritance between versions of a DTD, afaik). Actually, as other readers noted, it would be terribly stupid to make your web server or client rely on a third-party computer for which you cannot guarantee the uptime.

    These URLs are mainly there for their uniqueness, not so much as a place of guaranteed storage. Of course, they are also a nice place to look for the actual definition, but after that you would need a local repository. This is the first thing an XML library should support, and the first thing a moderately intelligent programmer should look at. I get *very* annoyed when this kind of basic rule is ignored. And I've even seen them ignored by people pointing to the XML digital signature definitions, where security and reliability should be the first requirements in the design.

    Also, what would happen if w3c.org or netscape.com go the way of the Gopher? If they go bust? It's a quickly changing world out there.
    • by voidptr ( 609 )
      The DTD is spelled out in the incoming feed file though. Even if I had a local copy in my application, the RSS feed I just grabbed from Joe's Random Blog probably references the Netscape copy, and there historically hasn't been a good mechanism to redirect that to a local DTD store.

      The URLs there were initially intended to be used as unique namespaces since each domain had someone that could guarantee uniqueness inside that domain, not really as actual retrievable URLs, although they were usually valid and
      • "... and there historically hasn't been a good mechanism to redirect that to a local DTD store."

        You mean like an EntityResolver? Of course I agree on the last statements.
        • by voidptr ( 609 )
          I said "historically". I've seen a half dozen Java XML parser toolkits recently, and not one mentioned setting that up anywhere in their docs. The hooks are there, but nobody knows to use it.
          • Ok, I'll have to agree on that. There are some seriously bad Java docs out there. The XML digital signature is another example. A programmer just following the docs would simply check whether the signature matched, and think the document was signed. If you sent a message whose signature actually signed something completely different, the program would still say that the signature was valid. That's just the URL pointing to an incorrect location; there are about a dozen other checks that need to be performed th
    • From the article:
      Apparently, a lot of feed readers and services (e.g. Firefox and Google) don't bother following links to dtd files, or may have their own cached versions available. But others -- Microsoft's Live.com RSS feed gadget is one example -- do check for the files, and refuse to load feeds if the referenced dtd file can't be located.
      Microsoft software at its best again. Don't look surprised.
    • by DavidTC ( 10147 )

      If the w3c collapses web standards are screwed anyway.

  • Logs? (Score:3, Interesting)

    by hhawk ( 26580 ) on Sunday January 14, 2007 @12:22PM (#17603512) Homepage Journal
    Not trying to be a troll here, but one would think that that file would have been accessed quite often and that this would have shown up in the logs...

    If I were a new hire at some old company where everyone else had been let go, I'd at least check out the logs and see what is being used. And then, if some file is being hit thousands of times a day... maybe ask a few questions.
  • Last time I looked, if the RSS 0.91 feed references the DTD, IE7 refuses to display it anyways... it's only if the DTD reference is removed that IE7 "works" "properly"
  • by Shaltenn ( 1031884 ) <Michael.Santangelo@gmail.com> on Sunday January 14, 2007 @01:43PM (#17604330) Homepage
    Once upon a midnight dreary, while I websurfed, weak and weary,
    Over many a strange and spurious bookmark of 'free news galore',
    While I clicked my fav'rite feed, suddenly there came a warning,
    And my heart was filled with mourning, mourning for my dear amour.
    "'Tis not possible," I muttered, "give me back my free news source!"
    Quoth the server, "404".
  • Anyone remember the incident, maybe 8 years ago, when the root DNS servers dropped the entry for localhost, 127.0.0.1? We had a lot of random code break because of that.
  • Wow.

    I'd swear today I was looking at the /. frontpage from 4+ years ago...

    After a little searching, I've found the exact same story, from April 2001:

    http://slashdot.org/article.pl?sid=01/04/28/2119211 [slashdot.org]

    Those who forget history are doomed to repeat it.

    I look forward to reading this story on /. again in 2013.
  • The web needs some scheme for content based addressing. Like the urn:sha1 scheme used in gnutella. This (and some sort of reasonable caching scheme) would do a lot to alleviate problems like this. It could also help a lot with the Slashdot effect.

  • Referencing the other topic today...

    You mean to tell me that every RSS reader references - and actually tries to FIND and DOWNLOAD - a specific SPECIFICATION hosted on ONE SITE ON THE PLANET?

    Are you people utter fucking morons or what?

    I can't believe design decisions like this.

    I'm especially irritated because I have just spent the last week trying to find an rdiff-backup or rsync that functions on Windows WITHOUT A FUCKING 2GB or 4GB FILE SIZE LIMITATION! Even the Cygwin people could only tell me to "try it
  • The file is now restored, but it will not be available forever. See this post at the Netscape blog [netscape.com] for the full details.
