Independent Data and Formatting with Microformats

Posted by ScuttleMonkey on Tuesday July 11, 2006 @07:21PM from the web-fun dept.

IdaAshley writes to tell us IBM DeveloperWorks is running an article about how to best utilize microformats to embed data within standard XHTML code. From the article: "Microformats are a pragmatic approach to solving the issue of structured data on the Web. Is it as architecturally pure as XML-encoded data separated from its formatting through a mechanism such as XSLT style sheets? No. But I think this approach is a realistic middle step that will help build a more intelligent Web that is easier to use and provides better search and data integration."

This discussion has been archived. No new comments can be posted.

Geez, man... (Score:3, Insightful)

by Chysn ( 898420 ) writes: on Tuesday July 11, 2006 @07:26PM (#15702257)

Some of us have been doing this for YEARS. At least now we have a buzzword for it.

- Re:Geez, man... (Score:2, Informative)
  
  by ChaoticChowder ( 971057 ) writes:
  
  I just wrote a Java program to do all that in one step last week. I even took it a step further and used the Sun classes for parsing HTML and Xerces for XHTML. Anyone who has ever had to do a datamining project knows how to do this. I don't really think this is a big deal at all. Just another excuse to apply a Web 2.0 buzzword to a technique that's been around for quite a while. Tutorials on the web these days are getting to be pretty lame. Maybe I'll write a couple myself, at least I have the chance
- Wheel of re-incarnation strikes again... (Score:2, Informative)
  
  by sreekotay ( 955693 ) writes:
  
  Mixing presentation and data - good... bad... good. But it gets better a little, each time (maybe more of a spiral than a wheel).
  
  We're using them on aim pages [aimpages.com] for module development (I cover it a bit here [kotay.com]). It's a nice simple standard, and the idea needed SOME name - don't make more of it than it is.
  -----
  graphically speaking [kotay.com]
- Re:Geez, man... (Score:4, Insightful)
  
  by frisket ( 149522 ) writes: <{peter} {at} {silmaril.ie}> on Wednesday July 12, 2006 @05:10PM (#15708441) Homepage
  
  Some of us have been doing this for YEARS. At least now we have a buzzword for it.
  There is already a buzzword: tag abuse. It's the last resort of the untalented.
  This particular version is known as semantic imputation (giving things meanings they don't inherently have). It's neither new, special, exciting, nor useful, but at least we now know how little the people at IBM and Leverage Software know about markup and XML.
  I guess I'd better add a warning to the XML FAQ [silmaril.ie] about it...
  
META headers (Score:1, Interesting)

by RobotWisdom ( 25776 ) writes:

How much of this could have been done 5 years ago if the structured-HTML community hadn't blindly rejected META headers?
- Re:META headers (Score:4, Informative)
  
  by Anonymous Coward writes: on Tuesday July 11, 2006 @07:51PM (#15702379)
  
  Get off your hobby-horse, Jorn. At some point, please realise that you are clueless about markup. Only then will you be able to learn a bit about what you are so high-and-mighty about.
  
  Firstly, <meta> is an element type, not a header. It doesn't do your credibility much good when you don't even know what it is.
  
  Secondly, <meta> is an astonishingly limited element type. It's scoped to the page not particular parts of it, and it has a plain-text content model because it uses attributes instead of child elements.
  
  Thirdly, I anticipate you saying that you could fix this by changing the <meta> element type. Sure you could. You could fix it by changing it to a set of element types that describe content more accurately and changing it so that it could appear in other parts of the document. And you know what you'd have then? The structured HTML that you despise so much. That's right, microformats embody the very thing you are criticising.
  
  Finally, given that HTML hasn't changed recently to allow microformats, everything that is possible today with microformats was possible five years ago with microformats. It's a design strategy, not a new technology.
  
  Again, please learn a bit about something before you turn your nose up at it. You might be smart in other respects, but when it comes to markup, you are dumb. Please accept this so you can change it.
  
  - Re:META headers (Score:1, Funny)
    
    by Anonymous Coward writes:
    
    Looks like someone has been trolled up the ass badly.
  - Re:META headers (Score:2)
    
    by RobotWisdom ( 25776 ) writes:
    
    Get off your hobby-horse, Jorn
    Strangely enough, I just asked a question: "How much of this could have been done 5 years ago if the structured-HTML community hadn't blindly rejected META headers?"
    So if anyone is on a hobbyhorse, it's Mr Coward.
    It's scoped to the page not particular parts of it, and it has a plain-text content model because it uses attributes instead of child elements.
    So I guess I have to ask again: How much of microformats could have been done using META, given that it's scoped to
    - Re:META headers (Score:5, Informative)
      
      by Karma Farmer ( 595141 ) writes: on Tuesday July 11, 2006 @09:35PM (#15702796)
      
      How much of this could have been done 5 years ago
      All of it. Microformats use features introduced with HTML 4.0 in 1997, so all of this was possible nearly 10 years ago.
      
      How much of microformats could have been done using META
      None of it. META tags and microformats serve two entirely separate purposes, and neither is in any way a replacement for the other.
      
      - Re:META headers (Score:2)
        
        by RobotWisdom ( 25776 ) writes:
        
        How much of microformats could have been done using META... None of it.
        I just don't believe that. If you're describing one or more events, why can't you put most or all of those descriptions in META format?
        For me, the worst case for the waste in rejecting META is that we could have been putting Yahoo/Dmoz categories there, all this time, but haven't been because the 'cult' didn't think it was fancy enough.
    - Re:META headers (Score:3, Informative)
      
      by oneiros27 ( 46144 ) writes:
      
      So I guess I have to ask again: How much of microformats could have been done using META, given that it's scoped to the page (which is no problem for the most important page semantics), and uses attributes?
      
      Very little. For instance -- if I had a full page calendar display -- because META is scoped to the whole page, I couldn't include an event record for each individual event -- I'd have to have the person go to a 'more information' link, and then give the event information. If I wanted them to do tha
      - Re:META headers (Score:2)
        
        by RobotWisdom ( 25776 ) writes:
        
        I couldn't include an event record for each individual event
        I'm not convinced you've really tried-- suppose the METAs described an event1, an event2, etc, and whatever Firefox extension is tasked with extracting that info could look for flags in the body that show which event is described where?
        Microformats seem to be a classic 'bag taped to the side', because the logic of the semantic web was still poorly visualised when XML was selected. I'm just asking whether META doesn't deserve rehabilitation a
        
        - Re:META headers (Score:1)
          
          by Hynee ( 774168 ) writes:
          
          People like using the HTML the way they do. Use newsgroups if it suits you better. Simple as that.
      - Re:META headers (Score:2)
        
        by RobotWisdom ( 25776 ) writes:
        
        I personally didn't like the examples given in the IBM article
        Is that
        <abbr class="dtstart" title="20060501">May 1</abbr> -
        <abbr class="dtend" title="20060502">02, 2006</abbr>
        
        crap really the best they've got? It makes my eyes bleed...
        (Since the association between the human- and machine-readable texts is wholly imaginary, why not keep the machine version in META?)
        
        - Re:META headers (Score:2)
          
          by Baricom ( 763970 ) writes:
          
          In IBM's defense, it's not a format they've made up. hCalendar's [microformats.org] primary author is from Technorati [technorati.com].
- Re:META headers (Score:2)
  
  by stonecypher ( 118140 ) writes:
  
  It was all done 20 years before the web existed, as SGML. But thanks for playing.
Firefox (Score:1)

by Sir_Lewk ( 967686 ) writes:

I didn't know IBM used Firefox, I'd have figured that they had their own, "in-house" browser. Neato
- - Re:Firefox (Score:2, Funny)
    
    by Sir_Lewk ( 967686 ) writes:
    
    Why shouldn't they have an in-house browser? They do have Lotus Notes.
    - Re:Firefox (Score:1)
      
      by siegesama ( 450116 ) writes:
      
      Neither Lotus Notes nor Lotus Sametime (which I'm expecting someone else to mention any moment now) are really "in-house". They're both applications produced by companies purchased by IBM, which are still marketable products. We're just "eating our own dogfood"
- Re:Firefox (Score:2)
  
  by Drooling Iguana ( 61479 ) writes:
  
  WebExplorer FTW!
- Re:Firefox (Score:2)
  
  by FooAtWFU ( 699187 ) writes:
  
  They sort of have an "in-house" edition of Firefox that you can install through the IBM Standard Software Installer (or which you might get preinstalled on your ThinkPad). It's effectively the exact same thing, has a few extra search engines maybe (for searching the intranet, the internal "blue pages" directory, et cetera) and a little string in the window titlebar... maybe a few icons are different here and there...
Tagging in Text (Score:1, Informative)

by inKubus ( 199753 ) writes:

This is just tagging in text; it's exactly what you do for CSS: You're saying this text is of a certain class. And you contain it in a box. All this is doing is using the same stuff and storing a little variable name and using it later. One might argue you are already doing that with CSS, it's just formatting stuff you're attaching to the variable rather than, ah, data structure..

I do like the idea of being able to move XML around without having to parse to view the basic file in a formatted fashion. S
- Re:Tagging in Text (Score:2, Insightful)
  
  by cdcarter ( 822001 ) writes:
  
  The difference between this and text tagging is that this has a set structure.
- Re:Tagging in Text (Score:5, Informative)
  
  by Mr_Tulip ( 639140 ) writes: on Tuesday July 11, 2006 @09:05PM (#15702676) Homepage
  
  The thing that makes Microformats stand out from homebrew versions is the attempt to standardize the formats, allowing others to easily work out what microformat you are using and integrate them into their own site.
  The article mentions the wiki [microformats.org], but doesn't link to it, except at the very bottom of the resources section.
  
  - Re:Tagging in Text (Score:2)
    
    by inKubus ( 199753 ) writes:
    
    I was like, "I do this every day", so what? I see from the Wiki that it's like RSS, and they are trying to standardize the formats. Thanks. Everything is getting closer to being truly useful every day.
  - a standard that people are ACTUALLY USING! (Score:1)
    
    by MichaelMD ( 988372 ) writes:
    
    exactly! sure the idea of using css class names to represent something for a machine to read is not new as it is an obvious one. I thought of it too when I first saw CSS used just like I thought of using made-up tags to represent things when I first saw html ... but THAT IS NOT THE POINT - - the STANDARDISATION, the fact that LOTS OF PEOPLE ARE ACTUALLY STARTING TO USE IT, and the SIMPLICITY is what makes microformats interesting - For someone like me who has been looking for many years for ways to make
- Re:Tagging in Text (Score:3, Insightful)
  
  by stonecypher ( 118140 ) writes:
  
  I do like the idea of being able to move XML around without having to parse to view the basic file in a formatted fashion. So, you're mixing HTML with a tag. Again, SO WHAT? But what about the encapsulated text, what's the point?
  
  To make things application parsable. Try reading the article before complaining that you don't see the point.
  
  If you're going to use a viewer eventually
  
  If you'd bother to read the article, which is about comparing one application parsable format (iCal) to the new microformat, yo
LISP (Score:5, Insightful)

by Anonymous Coward writes: on Tuesday July 11, 2006 @07:56PM (#15702400)

I'm sure the LISP community would love to hear about this brand-new idea of embedding specialty, or domain-specific if you will, languages and data. How extraordinarilly novel.

You'll be running a limited LISP implementation on every browser in no time!

- Re:LISP (Score:5, Funny)
  
  by rblum ( 211213 ) writes: on Tuesday July 11, 2006 @08:49PM (#15702625)
  
  I wish the LISP community would finally stop whining and realize they're doing nothing we old farts haven't done in Turing machines!
  
  - Re:LISP (Score:3, Interesting)
    
    by The_Wilschon ( 782534 ) writes:
    
    http://en.wikipedia.org/wiki/Turing_tarpit [wikipedia.org]
  - Re:LISP (Score:2)
    
    by hey! ( 33014 ) writes:
    
    You had Turing Machines? Sonny, you kids don't know how good you had it. Back in my day we had to quarry granite blocks, drag them hundreds of miles, then fuss with them, just so we'd know if our barley crop was safe from frost.
    
    And did we get appreciation for all that work? Hah. Some people must think slaves grow on trees. They don't. They just end up on 'em.
    
    Or maybe in flaming wicker baskets.
    
    Um, what were we talking about?
  - Re:LISP (Score:1)
    
    by thePowerOfGrayskull ( 905905 ) writes:
    
    I wish the old fart community would stop whining and realize they've done nothing we haven't done in... uh...
- Re:LISP (Score:1, Troll)
  
  by stonecypher ( 118140 ) writes:
  
  I love it when the LISP community pretends they invented things they didn't, and that it's going to lead to LISP being in places it'll never be.
  
  It's even better when they can't spell simple words like extraordinarily.
- Re:LISP (Score:3, Interesting)
  
  by fm6 ( 162816 ) writes:
  
  I know an old LISP hacker who simply doesn't understand all the fuss over XML. To him XML documents are just S-Expressions, only klunkier!
  - And he asked for a wake-up call . . . (Score:2)
    
    by rodentia ( 102779 ) writes:
    
    when browsers have built in support.
- sufficiently complicated (Score:2)
  
  by rodentia ( 102779 ) writes:
  
  Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp. -- Philip Greenspun's Tenth Rule of Programming
Standardization is the problem (Score:5, Insightful)

by Anonymous Coward writes: on Tuesday July 11, 2006 @08:02PM (#15702433)

This suffers from the same thing XML did. Remember when XML was going to revolutionize communication between computers by structuring everything consistently? Then <lname> tripped over <lastname> which was crawling on the floor after being decked by <name last="Henry"/> who was rather pissed off after an argument with <name><last>Henry</last></name> and the whole thing went down in a pile of flames and is now relegated to being a 2MB configuration parsing library to embrace and extend "option=value".
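To make the naming clash concrete, here is a rough Python sketch using the standard library's `xml.etree.ElementTree`. All four encodings parse without trouble; what the consumer still has to supply is the knowledge of where each one keeps the last name. The `<person>` wrapper and the `last_name` helper are assumptions invented for this illustration, not part of any format discussed here.

```python
import xml.etree.ElementTree as ET

# The four encodings named in the comment above, each wrapped in a
# hypothetical <person> root so that it parses as a document.
VARIANTS = [
    "<person><lname>Henry</lname></person>",
    "<person><lastname>Henry</lastname></person>",
    '<person><name last="Henry"/></person>',
    "<person><name><last>Henry</last></name></person>",
]


def last_name(person):
    """Try each known convention in turn; return None if none matches."""
    # Conventions that store the value as element text.
    for path in ("lname", "lastname", "name/last"):
        node = person.find(path)
        if node is not None and node.text:
            return node.text
    # Convention that stores the value in an attribute on <name>.
    name = person.find("name")
    if name is not None:
        return name.get("last")
    return None


results = [last_name(ET.fromstring(doc)) for doc in VARIANTS]
print(results)
```

Every new convention means another branch in `last_name`, which is the cost being complained about: the parser is shared, the vocabulary is not.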

So now why is this "vevent" class special, and who decided it would be "vevent" and not "scheduledevent" or "calendarevent" or "microsoftcalendarhassomethingforyoutodotoday"? Clearly as a human I can look at "dtstart" and think about it and realize that this means the starting date, but how does a computer know this? If the "semantic web" is going to take off, then we need semantics, and pronto.

Hopefully any standardization doesn't turn into a nightmare though. I used to develop in the healthcare insurance claims field, and the old NSF format for transmitting an insurance claim electronically was a horrible death-by-committee piece of work. It was as if nobody could come to a consensus and the committee decided to just throw everything in. You might look at your insurance card and think "gee I have an insurance ID number" but no, in the NSF, there were about 10 different blanks for insurance IDs, depending. Is it a Medicare number? Then it goes in the Medicare blank. God forbid the computer would have just one blank and assume that if you're billing Medicare then the number in the blank is probably a Medicare ID. Medicare was easy, there's just one. Medicaid in most states has a billion subcontractors, all with names that have nothing to do with "medicaid" so you simply had to maintain a magic list of insurance plans that changed every other year or so that used the Medicaid ID field. Or the separate fields for Blue Cross and Blue Shield. What about the states where you have BCBS as a single entity?

Anyway, I'm digressing (and ranting about a chunk of my life I'd much rather forget). What's important in standardizing in semantics is identifying everywhere where things are identical and reusing semantics whenever possible. Decisions have to be made up front as to what is the relationship between "name" and "last name" (people have a name, which has a last name, yet companies have names that typically don't have a last name. What about a cat named "John K. Wibblesworth" how is that different from one named "Tama"?) Yet, take dtstart which is used here for a calendar event. Should we have "dtclassstart" for the first day of school?
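One way to picture the answer to "how does a computer know this?": the program knows only because it was written against an agreed list of class names. The sketch below is a minimal illustration built on Python's standard `html.parser`, not a conforming hCalendar parser. The two `<abbr>` elements are the ones quoted from the IBM article elsewhere in this thread; the wrapping `<div class="vevent">`, the summary text, the `HCalendarSketch` class and the `to_ical` helper are all assumptions made for the example.

```python
from html.parser import HTMLParser

# The <abbr> lines are the ones quoted in this thread; the wrapping
# <div class="vevent"> and the summary text are assumed for the sketch.
SAMPLE = """
<div class="vevent">
  <span class="summary">Example conference</span>:
  <abbr class="dtstart" title="20060501">May 1</abbr> -
  <abbr class="dtend" title="20060502">02, 2006</abbr>
</div>
"""

# Elements that never get a closing tag, so they must not affect depth.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class HCalendarSketch(HTMLParser):
    """Collects dtstart, dtend and summary from inside class="vevent"."""

    def __init__(self):
        super().__init__()
        self.events = []     # one dict per vevent found
        self._depth = 0      # nesting depth inside the current vevent
        self._field = None   # text property currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._depth:
            if tag not in VOID:
                self._depth += 1
        elif "vevent" in classes:
            self.events.append({})
            self._depth = 1
            return
        else:
            return  # outside any vevent: nothing to record
        event = self.events[-1]
        # Dates are read from the machine-readable title of <abbr>.
        for name in ("dtstart", "dtend"):
            if name in classes and tag == "abbr":
                event[name] = attrs.get("title", "")
        if "summary" in classes:
            event["summary"] = ""
            self._field = "summary"

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        self._field = None  # capture stops at the first closing tag

    def handle_data(self, data):
        if self._depth and self._field:
            self.events[-1][self._field] += data


def to_ical(event):
    """Render one extracted event as iCalendar VEVENT lines (no escaping)."""
    lines = ["BEGIN:VEVENT"]
    for key in ("dtstart", "dtend", "summary"):
        value = event.get(key)
        if not value:
            continue
        # Eight-digit values are plain dates, which iCalendar marks explicitly.
        param = ";VALUE=DATE" if key != "summary" and len(value) == 8 else ""
        lines.append(f"{key.upper()}{param}:{value}")
    lines.append("END:VEVENT")
    return "\r\n".join(lines)


parser = HCalendarSketch()
parser.feed(SAMPLE)
events = parser.events
print(events)
print(to_ical(events[0]))
```

Nothing in the code infers that `dtstart` is a start date. The meaning lives in the published specification and in the programmer who read it, which is also true of any iCalendar reader.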

- Re:Standardization is the problem (Score:5, Insightful)
  
  by Bogtha ( 906264 ) writes: on Tuesday July 11, 2006 @08:14PM (#15702485)
  
  Remember when XML was going to revolutionize communication between computers by structuring everything consistently?
  
  No. I do remember how a lot of clueless PHB-types ran around telling everybody that though. XML solves the parsing problem, not the semantics problem. It's languages built on top of XML that handle semantics.
  
  XML was never meant to solve the problem you are talking about. Parsing markup into a tree is a totally different concept to figuring out what the stuff in the tree means. The only people who ever thought XML had something to do with what you say were totally clueless about XML.
  
  So now why is this "vevent" class special, and who decided it would be "vevent" and not "scheduledevent" or "calendarevent" or "microsoftcalendarhassomethingforyoutodotoday"?
  
  It's special because it appears in the hCalendar specification [microformats.org]. The people who wrote the specification decided it would be "vevent". They intend to submit it to a standards body.
  
  - Re:Standardization is the problem (Score:1)
    
    by MichaelMD ( 988372 ) writes:
    
    it's also easy to remember for anyone who has dealt with iCal data... the names used in hCalendar are basically lower case versions of the equivalent iCal names and are used for the same things - so that makes it easier to convert from hCal to iCal and vice-versa
  - Re:Standardization is the problem (Score:2, Insightful)
    
    by Anonymous Coward writes:
    
    XML solves the parsing problem, not the semantics problem.
    
    What parsing problem? Parsing is one of the most well-understood areas of computer science. Any comp. sci. graduate should be able to knock up a simple recursive descent parser, and there are dozens of good parser generators out there. It is the lack of semantics that makes XML little better than plain text — all the hard problems are left to applications.
    - Re:Standardization is the problem (Score:2)
      
      by Bogtha ( 906264 ) writes:
      
      What parsing problem?
      The problem of "I have a load of data that I need to be able to store and then restore into an easily manipulatable structure in memory."
      Parsing is one of the most well-understood areas of computer science.
      Just because it's a well-understood area, it doesn't mean data magically leaps out of files into data structures, does it? There's still a problem of actually implementing it.
      It is the lack of semantics that makes XML little better than plain text
      I assume by "pl
- OK. Who else... (Score:3, Funny)
  
  by frank_adrian314159 ( 469671 ) writes:
  
  Who else read this:
  If the "semantic web" is going to take off, then we need semantics, and pronto.
  as:
  If the "semantic web" is going to take off, then we need semantics, and porno.
  - - Re:OK. Who else... (Score:2)
      
      by frank_adrian314159 ( 469671 ) writes:
      
      Oooooh! Somebody woke up cwanky this mowning. What's the matter? Bad bottle of milk?
- Re:Standardization is the problem (Score:5, Informative)
  
  by TedTschopp ( 244839 ) writes: on Tuesday July 11, 2006 @08:53PM (#15702642) Homepage
  
  So now why is this "vevent" class special, and who decided it would be "vevent" and not "scheduledevent" or "calendarevent" or "microsoftcalendarhassomethingforyoutodotoday"?
  
  The idea is to leverage standards that are already out there, and in this case it would be the iCalendar standard.
  
  - Re:Standardization is the problem (Score:2)
    
    by cerberusss ( 660701 ) writes:
    
    "Leverage"... *scratch*
    "standard"... *scratch*
    "XML"... *scratch*
    "microsoft"... *scratch*
    
    Bullshit! [bullshitbingo.net]
- Re:Standardization is the problem (Score:1)
  
  by KingMotley ( 944240 ) writes:
  
  Not to get off topic, but there are many reasons why NSF and ANSI 837 can support multiple IDs, like: A) Coordination of Benefits. Depending on who you are sending the claim to, if the insured has multiple insurance plans, one insurance company's payout may differ depending on what the other insurance payouts are. In some cases a primary insurance plan may need to forward the claim to a secondary or tertiary insurance company that uses a different ID. B) Better insured matching. If you supply an insured
- Re: Standardization is the problem (Score:2)
  
  by scdeimos ( 632778 ) writes:
  
  Decisions have to be made up front as to what is the relationship between "name" and "last name" (people have a name, which has a last name, yet companies have names that typically don't have a last name. What about a cat named "John K. Wibblesworth" how is that different from one named "Tama"?)
  And how do you classify people who have just one name, like "Virgil?" I don't mean "Virgil Williams" or "Andrew Virgil," just "Virgil." Is that his first name, last name or something else altogether?
  The problem wit
  - Re: Standardization is the problem (Score:1)
    
    by ktdid ( 988309 ) writes:
    
    I would imagine that the answer would be either with class="fn" or maybe class="nickname" since the hcard standard pretty much follows the vcard standard: http://www.ietf.org/rfc/rfc2426.txt [ietf.org]
  - Re: Standardization is the problem (Score:2)
    
    by martin-boundary ( 547041 ) writes:
    
    Virgil is not his last name. His full name is Publius Vergilius Maro. So you would probably fill in the name field as Publius V. Maro.
- Re:Standardization is the problem (Score:4, Insightful)
  
  by stonecypher ( 118140 ) writes: <<stonecypher> <at> <gmail.com>> on Tuesday July 11, 2006 @10:54PM (#15703098) Homepage Journal
  
  This suffers from the same thing XML did. Remember when XML was going to revolutionize communication between computers by structuring everything consistently?
  
  Yeah. It works when you use the same DTD, which was the promise. It's not XML's fault that you and your supplier can't get your ducks in a row. The purpose of XML is to provide a medium that two ends can use to standardize a communications format of their own design, while giving a regular form to said formats so that arbitrary formats could be supported by arbitrary tools. It fulfills this ideal quite well, as anyone even vaguely familiar with web standards knows. It is not meant to magically merge two inconsistent standards.
  
  Then <lname> tripped over <lastname> which was crawling on the floor after being decked by <name last="Henry"/> who was rather pissed off after an argument with <name><last>Henry</last></name>
  
  Yeah. And that's XML's fault how? Get a DTD and stick to it.
  
  and the whole thing went down in a pile of flames
  
  Yeah, essentially every office suite, database, most graphics editors, many layout programs, and quite a few games support XML. Jabber / Google Chat run on XML. The web is built on an SGML dialect, which is largely being converted into an XML dialect; XML is itself an SGML dialect. Web 2.0 (god I hate that name) is an outcropping of XML's parsability. XML is so useful that Microsoft was able to use it to ward Massachusetts' lawsuits off. The United Nations now releases their transcripts solely in XML. XML is now the second most pervasive data storage format on earth, after CSV/TSV, and it's gaining fast. (Don't bother saying SQL - it's an API, not a storage format.)
  
  Exactly what is your definition of "going down in flames" ?
  
  and the whole thing went down in a pile of flames and is now relegated to being a 2MB configuration parsing library to embrace and extend "option=value".
  
  Uh, TinyXML has a footprint of 40k, champ. Also, that's not what "embrace and extend" means.
  
  So now why is this "vevent" class special, and who decided it would be "vevent" and not "scheduledevent" or "calendarevent" or "microsoftcalendarhassomethingforyoutodotoday"?
  
  What a surprise, the guy who couldn't standardize on a DTD now fails to understand other format standardizations. Read the article, champ. It's not SlashDot's job to read for you, and this one's honestly pretty simple. Indeed, the specific purpose of microformats is to address your whining, but you don't see the point. Cough.
  
  Clearly as a human I can look at "dtstart" and think about it and realize that this means the starting date, but how does a computer know this?
  
  Er, by supporting a specific microformat. Are you putting in effort to be dense? It's the same way they support iCal, or MS Word files, or in fact any format at all, ever.
  
  If the "semantic web" is going to take off, then we need semantics, and pronto.
  
  This has nothing to do with the semantic web. You want to drop another? Ontological Web Language sounds important too. Use that one more often: fewer people will see through you.
  
  God forbid the computer would have just one blank and assume that if you're billing Medicare then the number in the blank is probably a Medicare ID.
  
  Yes, I'm sure the people billing Medicare who aren't using Medicare IDs will be greatly amused that your application just fails for them. Why is it that I don't believe you had much to do with the design of the system?
  
  What's important in standardizing in semantics is identifying everywhere where things are identical and reusing semantics whenever possible.
  
  "Semantics" aren't reusable. They're not arbitrarily applied. Please stop using words you fail to understand. Not every markup of data is semantic, even if the markup means something. Semantics are the work of understanding context, not identifying relations
  
  - Re:Standardization is the problem (Score:3, Insightful)
    
    by grcumb ( 781340 ) writes:
    
    " Then <lname> tripped over <lastname> which was crawling on the floor after being decked by <name last="Henry"/> who was rather pissed off after an argument with <name><last>Henry</last></name> "
    "Yeah. And that's XML's fault how? Get a DTD and stick to it."
    Well, actually, schema and RDF were supposed to address exactly that issue. So, in the opinion of the W3C, at least, it appears 'Get a DTD and stick to it' isn't the complete answer.
    But that's a simplistic
    - Re:Standardization is the problem (Score:2)
      
      by stonecypher ( 118140 ) writes:
      
      Yeah. And that's XML's fault how? Get a DTD and stick to it."
      
      Well, actually, schema and RDF were supposed to address exactly that issue.
      
      Schema is a replacement for DTD, because DTD has some subtle problems. RDF is actually for describing what's available on a service, not what's contained in one document; in a weird sort of way it's a conceptual parallel to the two for servers.
      
      That all said, it's worth noting that XML considers its data type as a critical and un-removable part of the document. So, sure,
  - - Re:Standardization is the problem (Score:2)
      
      by stonecypher ( 118140 ) writes:
      
      DTDs provide structure but no meaning beyond what humans ascribe to it.
      
      Uh, yeah, that's because that's what they're for. I said the reason he didn't have structure was because he didn't provide an appropriate structure document. Now you're trying to rebut me by saying that they're really only for structure.
      
      I fail to see the disconnect here.
      
      Likewise, you claim that meaning is not reusable and are not arbitrarily applied, yet you chose to define bug as a Volkswagen, rather than as a vehicle, car, or automo
  - Re:Standardization is the problem (Score:1)
    
    by SporkLand ( 225979 ) writes:
    
    I'm not disparaging, I'm genuinely wondering:
    "Semantics are the work of understanding context, not identifying relationships."
    
    Isn't the work of "understanding context" simply identifying the relationships between certain items in your data and other items. Which may involve discovering further relationships?
    
    I'm not saying this to be a jerk, I really thought that semantics were derived by understanding the relationship between items.
    - Re:Standardization is the problem (Score:2)
      
      by stonecypher ( 118140 ) writes:
      
      Isn't the work of "understanding context" simply identifying the relationships between certain items in your data and other items. Which may involve discovering further relationships?
      
      No.
      
      Basically, the issue is this. Semantics are specifically the case of attempting to discern the meaning of a word given its usage. When you have something that says "anything in this column is a FOO," then there's no need for semantics: usage is moot, as the meaning of what's in that column is absolutely described. Semanti
      - Re:Standardization is the problem (Score:2)
        
        by stonecypher ( 118140 ) writes:
        
        Forgetting to close the <tt> on code is for the lose. Sorry about the eye-pain.
- Re:Standardization is the problem (Score:2)
  
  by Phreakiture ( 547094 ) writes:
  Decisions have to be made up front as to what is the relationship between "name" and "last name" (people have a name, which has a last name, yet companies have names that typically don't have a last name.
  
  Never mind company names; names of persons can be extremely difficult to parse. That which we call a "last" name is usually better described as a "family" name. Consider the following names:
  
  John Smith
  Wu Xue Jen
  Juan Carlos Jimenez Garcia
  Rev. Dr. Martin Luther King, Jr., PhD
  All four of these na
I don't get it... (Score:5, Insightful)

by grumbel ( 592662 ) writes: <grumbel+slashdot@gmail.com> on Tuesday July 11, 2006 @08:27PM (#15702550) Homepage

Ok, so this "microformats" thing is about encoding extra data inside an HTML file by abusing CSS class names for markup. Isn't that completely unnecessary and nothing more than an ugly hack? Don't we have XML namespaces for exactly that reason? Wouldn't something like: <span style="display: none"> <vevent:event> <vevent:dtstart>20060501</vevent:dtstart> <vevent:dtend>20060502</vevent:dtend> <vevent:summary>My Conference opening</vevent:summary> <vevent:location>Hollywood, CA</vevent:location> </vevent:event> </span> be the 'right'[tm] way to do it?
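
For illustration, a consumer of the namespaced markup proposed above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the comment never binds the `vevent` prefix to a namespace URI, so the URI below is hypothetical, and the helper name `read_events` is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the comment above does not specify one.
VEVENT_NS = "http://example.org/vevent"

DOC = f"""
<span xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vevent="{VEVENT_NS}" style="display: none">
  <vevent:event>
    <vevent:dtstart>20060501</vevent:dtstart>
    <vevent:dtend>20060502</vevent:dtend>
    <vevent:summary>My Conference opening</vevent:summary>
    <vevent:location>Hollywood, CA</vevent:location>
  </vevent:event>
</span>
""".strip()


def read_events(xml_text):
    """Return one dict per vevent:event element, keyed by local element name."""
    root = ET.fromstring(xml_text)
    events = []
    for event in root.iter(f"{{{VEVENT_NS}}}event"):
        fields = {}
        for child in event:
            # ElementTree spells qualified names as "{uri}local"; keep the local part.
            local = child.tag.split("}", 1)[-1]
            fields[local] = (child.text or "").strip()
        events.append(fields)
    return events


events = read_events(DOC)
print(events)
```

Note that this only works if the surrounding page is parsed as well-formed XML, which is part of what the replies below argue about.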

- Re:I don't get it... (Score:5, Informative)
  
  by Karma Farmer ( 595141 ) writes: on Tuesday July 11, 2006 @09:22PM (#15702741)
  The class attribute was never intended to be limited to CSS. From the HTML 4.01 specification:
  The class attribute... assigns one or more class names to an element; the element may be said to belong to these classes. A class name may be shared by several element instances. The class attribute has several roles in HTML:
  
  As a style sheet selector (when an author wishes to assign style information to a set of elements).
  For general purpose processing by user agents.
  - I'll get it. (Score:2)
    
    by rodentia ( 102779 ) writes:
    
    I don't believe it was intended to contain an alias (in Sowa's sense) or a general nomenclatura, however. This innovation actually undercuts the *semantic web* fairly radically, by confusing names with types, proper nouns with classes, as discussed in the second chapter of his Knowledge Representation.
    
    XML, as pointed out clearly elsewhere in the thread, is a conventional syntax for the representation of heterogeneous schemata. An XSL stylesheet is a deterministic means of defining the relationship betw
- Re:I don't get it... (Score:2, Insightful)
  
  by jandrieu ( 988307 ) writes:
  
  Your technique hides the semantic data from normal view and forces the author to replicate what they don't want hidden.
  With microformats, the data is presented once, with a few simple tags, and is then available to both HTML viewers/users and semantic parsers.
  - Re:I don't get it... (Score:1)
    
    by mk_is_here ( 912747 ) writes:
    
    Why not XSLT? Create a self-defined XML format, then attach an XSLT template [wikipedia.org] to it. And we don't even need a server-side script to make it work!
    - Re:I don't get it... (Score:1)
      
      by jandrieu ( 988307 ) writes:
      
      Your question implies the answer.
      XSLT is a new language to learn. Defining your own XML can be tricky. Integrating it on the server takes some effort and if you want that transform executed on the browser, it certainly will NOT work with as many clients as plain ol' HTML or XHTML.
      Microformats, OTOH, exist as socially defined semantic packages based on real world usage (meaning they've been through the wringer and had the bugs worked out, mostly). The author doesn't have to define their own language or learn
      - Re:I don't get it... (Score:1)
        
        by mk_is_here ( 912747 ) writes:
        
        How tricky is it to self-define a lightweight XML format? Use whatever element you like, and design the data structure on your own that suits you best. Why do we need to design a new language?
        
        How is it different from calling a server to output XML and output HTML/XHTML? Which modern browser today does not support XSLT? Firefox, Internet Explorer? (Yes, Opera will support XSLT 1.0 in the coming version 9)
        
        BTW, There are server-side XSLT processors (for very-old browsers sake). For instance, this [php.net], this [sun.com] and
        
        Re:I don't get it... (Score:2, Insightful)
        
        by jandrieu ( 988307 ) writes:
        
        *Any* design activity is more complicated than copying a proven, open source design. And if you want that design to be understood by someone else, you still need to (correctly) use a common vocabulary.
        
        It is easier to use what you know (HTML+CSS) and rely on the technology you understand (IE/Firefox/etc). That's it. Some people like to play in new techno sandboxes. Others just need to publish their kid's soccer schedule on their webpage and aren't about to read the help files at their ISP--or sourceforge o
  - XML can be styled (Score:2)
    
    by Fastolfe ( 1470 ) writes:
    
    If the parent document is XHTML, and the browser understands that, CSS can easily be used to style these additional non-XHTML elements any way you like.
- Re:I don't get it... (Score:2)
  
  by stonecypher ( 118140 ) writes:
  
  Well, actually it's what XHTML is for - namespaces are just to prevent name conflicts, like namespaces in C++. Sure, XML is for custom markup, but Microformats are about embedding formats, not creating them. It's a subtle, and some would contend, pointless difference; that said, given what you said, I'm willing to bet you'll see the importance.
  
  But, yes, you're right to point out that the buzzword web is reinventing yet another tool needlessly and badly.
  - Re:I don't get it... (Score:2)
    
    by Fastolfe ( 1470 ) writes:
    
    but Microformats are about embedding formats, not creating them
    
    It seems to me that creating them is exactly what this is about. Taking a step back, what they're saying is, "XML is hard. But if you make up a pattern of HTML elements and reserve some class names, programs can parse out information in standard ways."
    
    This is the same problem that XML namespaces were intended to solve. OK, so this works for a handful of "formats". Clever (and planned) use of CSS gets the data displayed and compatible user ag
    - Re:I don't get it... (Score:2)
      
      by stonecypher ( 118140 ) writes:
      
      It seems to me that creating them is exactly what this is about.
      
      Like I said, it's a subtle difference, and I don't expect most people to get it.
      
      Taking a step back, what they're saying is, "XML is hard. But if you make up a pattern of HTML elements and reserve some class names, programs can parse out information in standard ways."
      
      I've never seen anyone say that. Indeed, these are no different than XML itself, and are in fact valid XML. Please show me someone saying the words "XML is hard," or any actual ev
      - Re:I don't get it... (Score:2)
        
        by Fastolfe ( 1470 ) writes:
        
        I do not appreciate the condescending tone. Just because someone disagrees with you does not mean they are not literate or not paying attention to the discussion. Reasonable people can disagree reasonably.
        
        Please show me someone saying the words "XML is hard," or any actual evidence in that direction.
        
        My comment was not intended to be a literal quotation.
        
        From the article:
        You see, for a while now, people have tried to extract structured data from the unstructured Web. You hear glimmers of these when people
History, failures, doomed to repeat (Score:5, Insightful)

by ekhben ( 628371 ) writes: on Tuesday July 11, 2006 @10:20PM (#15702971)

This is a kind of neat idea, except, of course, if I have CSS that does something with, oh, say, a class of "dtstart". Sure, it's easy to recognise that ".vevent > .url > .dtstart" is a microformat data item for an hCalendar, but if I'm already using "dtstart" or "url" regularly in my markup so I can apply styles to those kinds of things, I'm pretty much SOL. Rewrite all your markup and CSS to stop using those names.

There's no namespacing. There's not even an ATTEMPT at namespacing. This will fast become an unmanageable hodge-podge of insanity, with common words used willy-nilly in class attributes.

The class attribute is defined as CDATA. That's it. You can use pretty much ANY character in it. There's a lot of characters that can't be used in a CSS selector, though, such as ":". See where I'm going with this? <div class="mf:vevent"> for a start. Better yet, <div class="hidden mf:vevent"> such that you can hide (or format) the block of data separately.
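
A rough Python sketch of how a parser might pick out such prefixed class tokens, using only the standard library's `html.parser`. The `mf:` prefix is the one suggested above, not part of any published microformat, and the class name `PrefixedClassFinder` is made up for this example.

```python
from html.parser import HTMLParser

PREFIX = "mf:"  # hypothetical prefix, taken from the suggestion above


class PrefixedClassFinder(HTMLParser):
    """Collect (tag, name) pairs for class tokens that carry the prefix."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        class_value = dict(attrs).get("class") or ""
        # The class attribute holds a whitespace-separated list of tokens.
        for token in class_value.split():
            if token.startswith(PREFIX):
                self.found.append((tag, token[len(PREFIX):]))


finder = PrefixedClassFinder()
finder.feed('<div class="hidden mf:vevent"><span class="dtstart">x</span></div>')
finder.close()
print(finder.found)
```

Here the plain `dtstart` class used for styling is ignored and only the prefixed `mf:vevent` token is reported, which is the separation the prefix is meant to buy.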

Now, as if that wasn't bad enough, and, trust me, it IS bad enough, there's also the misuse of the "title" attribute and the "abbr" element. A machine formatted date is not the expanded version of a human formatted date, which is not an abbreviation. A renderer trying to make sense of <abbr class="dtstart" title="10034134134T00">17th Smarch</abbr> will think "AHA! This here is an abbreviation, I will provide unto the user some means to see what that '17th Smarch' abbreviation stands for!" Usability disasters follow.
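
To make the pattern being criticised concrete, here is a minimal Python sketch of how a consumer would read it: the machine value comes from `title`, the human text from the element body. The sample date `20060501` is borrowed from elsewhere in this thread, the human text "May 1st" is made up, and `AbbrDateReader` is a hypothetical name.

```python
from html.parser import HTMLParser


class AbbrDateReader(HTMLParser):
    """Pair the machine value (title) with the human text of abbr.dtstart."""

    def __init__(self):
        super().__init__()
        self.dates = []      # list of (machine_value, human_text)
        self._title = None   # title of the abbr currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "abbr" and "dtstart" in classes:
            self._title = attrs.get("title") or ""
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside a matching abbr element.
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._title is not None:
            self.dates.append((self._title, "".join(self._text).strip()))
            self._title = None


reader = AbbrDateReader()
reader.feed('<abbr class="dtstart" title="20060501">May 1st</abbr>')
reader.close()
print(reader.dates)
```

A data consumer gets what it wants from `title`, while a browser or screen reader that treats `title` as the expansion of an abbreviation would present the raw machine value to the user, which is the usability complaint made above.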

So, in summary, this is the worst idea I've seen in HTML space since some bright spark said, "let's suggest that people use the 'text/html' content type for their XHTML markup!"

- Re:History, failures, doomed to repeat (Score:1)
  
  by thePowerOfGrayskull ( 905905 ) writes:
  
  There's no namespacing. There's not even an ATTEMPT at namespacing. This will fast become an unmanageable hodge-podge of insanity, with common words used willy-nilly in class attributes.
  Human sacrifice, dogs and cats living together -- mass hysteria!
- Re:History, failures, doomed to repeat (Score:1)
  
  by jt2190 ( 645297 ) writes:
  
  This is a kind of neat idea, except, of course, if I have CSS that does something with, oh, say, a class of "dtstart". Sure, it's easy to recognise that ".vevent > .url > .dtstart" is a microformat data item for an hCalendar, but if I'm already using "dtstart" or "url" regularly in my markup so I can apply styles to those kinds of things, I'm pretty much SOL.
  Not necessarily. If the existing style rules don't look ugly when applied to the microformat then no problem. Otherwise, do exactly
HoTMetaL (Score:3, Insightful)

by Doc Ruby ( 173196 ) writes: on Tuesday July 11, 2006 @10:45PM (#15703063) Homepage Journal

And I think that muddling data and presentation without explicit distinction is exactly what was wrong with HTML. Which we just spent a decade slightly recovering from. I guess IBM has made a lot of money on crappy tools, good tools to extract data from crappy data, and extra money for doing it right.

Pingerati from Technorati (Score:2)

by otisg ( 92803 ) writes:

The VERY relevant site that Jack Herrington forgot to mention there is Pingerati [pingerati.net]. That is THE site through which all these Microformats are shared. The system is based on pings, much like the rest of the blogosphere. Both Pingerati [pingerati.net] and Microformats [microformats.org] have a major force behind them: Technorati [technorati.com].
hResume and Emurse.com (Score:2)

by arudloff ( 564805 ) * writes:

We're looking to implement hResume on Emurse.com [emurse.com] web resumes here in the next couple of days.
I'm really excited about being able to push the standard some. We've been wondering what its negative effects could be, though, in terms of screen scrapers (alex.emurse.com, for instance). Anyone have any thoughts?
We've built hResume support to be configurable by the user, if it proves to be an issue. Just wondering how we should initially offer it.
I Was Going To Say... (Score:4, Interesting)

by Carcass666 ( 539381 ) writes: on Wednesday July 12, 2006 @12:51AM (#15703437)

I was going to say "I Don't Get It" but somebody beat me to it.

I think the title of TFA "Separate data and formatting with microformats" is a bit ironic since it's about wedging your data into a web page in such a fashion that somebody might be able to pull it back out.

If you want to make your data available there are all sorts of standard and more efficient ways of doing it than embedding it in the presentation layer. If somebody is going to all the trouble to create a parseable human-readable page, why wouldn't they go to about the same amount of trouble and make a far more efficient and standard RSS feed? What about the buzzword of the last few years, SOAP? Hell, what about XML?

From TFA:
How great is that? I have one script that reads a page with calendar items and exports it as XML. Then, I have another page that turns that XML back into calendar items. The original script can then read that page and come out with the same data. It's definitely a circular action.
Okay, maybe it's not that great.

I agree. This reminds me of the lame number tricks where you have somebody pick a number, add something, multiply it by something, blah blah blah, you take the result, divide it by 7 and then you give them their original number because you had it all set up ahead of time. If they screw up in their calculations, the trick doesn't work. In this thing, if you screw up embedding the text within the HTML (plenty of ways to do that), the trick doesn't work - and doesn't accomplish much even if it does.

JSON (Javascript over the wire) (Score:2, Informative)

by c0d3r ( 156687 ) writes:

Look into JSON... it's basically JavaScript data structures that you eval on the client. Why bother assembling thick XML that needs to be parsed on the client? XML is slow, and even slower if you have to XSLT it out of the XHTML.
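
As a small illustration of the idea, here is a Python analogue (the comment itself is about JavaScript on the client). The payload is invented for the example, reusing the event from earlier in the thread, and `is_json` is a hypothetical helper. It also shows one reason to prefer a real JSON parser over a bare eval: a parser rejects input that is code rather than data.

```python
import json

# Invented payload: the calendar event from earlier in the thread, as JSON.
payload = '{"summary": "My Conference opening", "dtstart": "20060501", "dtend": "20060502"}'

# json.loads only accepts data; it never executes anything.
event = json.loads(payload)
print(event["summary"], event["dtstart"])


def is_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_json(payload))
print(is_json("alert(1)"))
```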
We have this, only IE does not support it. (Score:2)

by Jerk City Troll ( 661616 ) writes:

It appears you were thinking about the data URI scheme [wikipedia.org]. Unfortunately, and very much like modern CSS standards, the only browser to not support it is the one with the greatest market share.
I don't see how this is better than XML/XSLT. (Score:2)

by poot_rootbeer ( 188613 ) writes:

This "Microformatting" concept is predicated on the idea that data is (or should be) human-readable in its default state, but with mechanisms that make it easier to translate it into something machine-readable. This seems backwards to me.

Humans only need to be able to comprehend the data structure at two points: input and output. In between, computers may perform a thousand different transfers and transformations on the data, and at those points, the ability to see the data in plain English (or plain Anyo
- Re:I don't see how this is better than XML/XSLT. (Score:1)
  
  by MichaelMD ( 988372 ) writes:
  
  >This is not to be encouraged.
  
  so if you had your way we wouldn't have search engines like google, etc either?
