
Jack William Bell's Journal: Why RDF is hard

Dave Winer has some problems with RDF. He goes into some detail about why, but it basically boils down to "RDF is too complicated."

I have been struggling with RDF myself, because of my peripheral involvement in the Chandler project, which uses RDF extensively. And I am beginning to think the real problem is that RDF is too simple...

How do I explain this? Well, RDF is made up of only a few rules. You have statements consisting of a Subject, a Predicate and an Object, along the lines of "Charlie is the father of George." You can make multiple statements that connect via Subject or Object Nodes, so I can add that "George owns the Hotdog Hut." "The Hotdog Hut is in competition with the Burger Palace." "The Burger Palace is managed by Charlie."

There, four statements that encapsulate an awful lot of information about Charlie, George and their relationships. Add to this that I can tack on other information to each of those statements, like for example the menus of the Hotdog Hut and the Burger Palace. And I can group statements such that I can say "Charlie is the father of George, Martha and Peter." What is more, I can make 'statements about statements' like "Jack said all of these things."
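
A minimal sketch of the statements above, assuming plain Python tuples stand in for RDF triples. The short names (`"Charlie"`, `"fatherOf"`, `objects_of` and so on) are made up for illustration; real RDF would use URIs and a proper RDF library rather than bare strings.

```python
# Each statement is a (subject, predicate, object) triple.
# Plain strings stand in for the URIs real RDF would use (an assumption).
statements = [
    ("Charlie", "fatherOf", "George"),
    ("George", "owns", "HotdogHut"),
    ("HotdogHut", "competesWith", "BurgerPalace"),
    ("BurgerPalace", "managedBy", "Charlie"),
]

# Grouping: the same subject and predicate with several objects.
for child in ("Martha", "Peter"):
    statements.append(("Charlie", "fatherOf", child))

# Statements about statements: record that Jack said each triple.
said_by = [("Jack", "said", triple) for triple in statements]


def objects_of(subject, predicate, triples):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of("Charlie", "fatherOf", statements))
# ['George', 'Martha', 'Peter']
```

Nothing here is specific to hot dogs or fathers: the same three-slot shape carries every statement, which is the whole point.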

Still with me? OK, from a purely lexical point of view RDF is simply a formalization of normal English sentence construction rules. In other words it is a way of describing things that has much of the power of the English language. Thought about in those terms you can begin to grasp the range of things you can do with RDF, but you will never be able to apply it!

However, in a mathematical sense what we are referring to here are Nodes and Edges in a Network Graph. In this case 'Nodes' are points of reference (for RDF they are typically URIs like 'http://www.google.com/') and 'Edges' are the predicates that define how Nodes are related. All this is done with a simple syntax that can be (but does not have to be) represented in XML. Anyone who has some grasp of the graph theory behind network graphs can apply RDF, but even they are going to have some difficulty conceptualizing it.
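
To make the Nodes-and-Edges view concrete, here is a rough sketch that turns a handful of triples into a labelled directed graph and walks it. The adjacency-list layout and the `reachable` helper are my own illustrative choices, not anything the RDF specs prescribe, and short strings again stand in for URIs.

```python
from collections import defaultdict, deque

# Illustrative triples; short strings stand in for URIs (an assumption).
triples = [
    ("Charlie", "fatherOf", "George"),
    ("George", "owns", "HotdogHut"),
    ("HotdogHut", "competesWith", "BurgerPalace"),
    ("BurgerPalace", "managedBy", "Charlie"),
]

# Adjacency list: node -> list of (edge label, neighbouring node).
graph = defaultdict(list)
for subject, predicate, obj in triples:
    graph[subject].append((predicate, obj))


def reachable(start, graph):
    """Breadth-first walk returning every node reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _label, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


print(sorted(reachable("Charlie", graph)))
# ['BurgerPalace', 'Charlie', 'George', 'HotdogHut']
```

With four statements the walk is trivial. The same loop over a whole town's worth of triples is where it stops being easy to picture.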

Why is this? Didn't I start out by saying it was simple? Well, you see the problem is that very complex scenarios can emerge out of the interaction of a few simple rules. This concept of Emergent Behavior is a hot topic right now because it promises to finally explain (among other things) how our very minds can operate in a meat matrix of simple cells with simple connections. The trick is that you need lots of those simple cells.

RDF is an emergent system consisting of a network graph realization framework made out of a few simple rules. The interaction of those rules opens up an entire universe of possibilities and it is difficult to get your head around an entire universe. Of course, so long as you are only dealing with a few things (Charlie and his son George) you are OK. But the minute you bring in the rest of the town things get ugly fast. RDF is designed to let you model the relationships of the entire town. That is why RDF is hard.

Why RDF is hard

  • There is an ongoing debate on Burningbird [burningbird.net] regarding the rather difficult syntax of the RDF XML spec. This is actually a different issue from the conceptual space of RDF I refer to in the entry above.

    Of course I stepped into that fight as well. :-) Go there to read the comments...
  • I agree with your points regarding emergence, but not the conclusion - "...the minute you bring in the rest of the town things get ugly fast."
    Why 'ugly' and not just 'interesting'?

    My own response to Dave Winer's "I can't do it so it must be hard" statement can be found here:
    RDF in 500 Words [citnames.com]

    • Ugly, I guess, in the sense that you have a spaghetti-mass of links running everywhere. Too much to comprehend just looking at it. Not pretty and certainly not easy for me to make use of.

      Interesting? Sure, the web itself is just such a mess of links and I certainly find it interesting. But I need tools to make the complexity easier to deal with, things like search engines, portals and blogs. We need tools like this for RDF-based systems, but we don't have the RDF-based systems yet (to speak of). So we have a chicken and egg problem.
      • I think the chicken-and-egg stage is passing now that there's loads more RDF coming online. Adobe's now using it, for example: the figure suggested is that about 10 million documents produced by Acrobat, Photoshop etc. will contain RDF by the end of the year. Time to get building those apps!
        • Time to get building those apps? Hoo boy! That is both a fantastic 'double dare you' and a mind boggler when you start asking yourself 'what apps'. Right this second I am drawing a blank.

          But then most of my thinking in this area has been turned to storage and searching lately. I have this old idea for an object database that, even 10 years ago, included something very close to minimal RDF. Lately I have been mulling over just how it would fit in...
          • heh - I hadn't actually intended that as a dare, but I'll sure let it rest as one ;-) I'm certain there's *loads* that can be done in the storage & searching domain. Go Jack, go! Oddly enough this is really what led me to look at RDF - I'd written a search engine and I wanted some kind of categorizing facility (never implemented) which led me through dmoz.org to RDF.
  • I think part of Dave's problem with understanding RDF has to do with what he has been exposed to and what he has not been exposed to..

    Those of us who were exposed to CASE tools, UML, or any graphical tool that described complex behaviour and relationships (for biology majors, it's Stella) understand RDF because we have seen a similar framework in action and know its power..

    To be perfectly fair to anyone .. someone may not understand quantum physics.. but the sun still goes through its quantum and other interactions whether you understand them or not..

    To say that it must not have meaning because I don't understand it is just plain pigheaded ignorance..

    The semantic web will be emerging whether Dave understands it or not..

    • I dunno. Dave Winer is a lot of things, but he certainly isn't stupid. He certainly knows enough math to comprehend RDF, but that isn't his real problem.

      Dave's issue is that it is needlessly complex. He thinks you should be able to grok it immediately or there is something wrong with the basic idea. So, in his world, you are better off just using something more specific to a particular problem, for example RSS [userland.com]. RSS might be problem specific, but Dave would say that is a good thing.

      He might be right about this on some levels: take a look at his XML-RPC spec [xmlrpc.com] and then compare it to the SOAP spec [w3.org]. Both do similar things and probably 80% of SOAP apps don't really do anything XML-RPC can't provide. But most of the SOAP protocol is a confusingly over-engineered mess (IMHO).

      However I would argue that RDF is an excellent intermediate-level abstraction and useful for modelling certain kinds of otherwise difficult to model problems. The real question to ask is this: Is RDF over-engineered or is it elegant? And I don't know the answer to this yet. Like computers themselves, RDF is bigger on the inside than it is on the outside. And I am still learning my way 'round.
  • I haven't had as much time to play with RDF as I'd like, and I keep hearing this triplet notion bandied about. On naive inspection, though, I see elements with many child elements from many namespaces embedded in them, which suggests to me that each triplet has a potential parent container as well as its noun, verb and object. Still, given that it's a tuple of whatever sort, I'm surprised we don't see more predicate calculus applications based on RDF: it would seem trivial to map the triplets into, say, Prolog facts, and then create data-mining applications out of inductive rules over those facts.

    Does anyone know of any RDF/Prolog apps?
    • Prolog! [shudder]

      I don't know of any RDF/Prolog apps, but I am fairly new at this stuff. I agree that it seems a natural fit, what with predicates, triples and all. But I never did get my head all the way around Prolog, though I tried for a while. I think Prolog (like Lisp) is one of the very few programming languages you actually learn more easily in a classroom than you do by just writing code with it. Of course your mileage may differ...
      • Seems I should learn to Google first, ask questions later: Mozilla RDF / Enabling Inference [mozilla.org] is just one of many links to surface from a simple search "prolog rdf".

        As for learning Prolog, it's easier than you might think: The trick is to think of statements where the RHS defines the truth of the LHS, so you're thinking of the program backwards from the solution to the list of sub-solutions, reduced recursively until the sub-solutions are trivial facts. The only real obstacle to learning Prolog is jumping in on a mature project like SWI and being faced with all the various hacks that go toward making pure logic into a practical programming tool. My advice: forget interfaces until you have the logic worked out.
    • The RDF academics are knee-deep in predicate (and other) calculus stuff (check out the model & theory doc in the specs), but engineers are also dipping their toes in. Re. Prolog + RDF, Bijan Parsia's written about this at xml.com: http://www.xml.com/pub/au/93
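
The triple-to-Prolog mapping raised in the comments above might be sketched roughly like this. It is only an illustration, assuming lowercase strings as stand-ins for URIs: `to_prolog_fact` and `siblings` are hypothetical helpers, and the single rule is hand-written in Python rather than run through a real Prolog engine or induced from data.

```python
# Illustrative triples; lowercase names double as Prolog-style atoms.
triples = [
    ("charlie", "father_of", "george"),
    ("charlie", "father_of", "martha"),
    ("george", "owns", "hotdog_hut"),
]


def to_prolog_fact(subject, predicate, obj):
    """Render one triple as a Prolog-style fact: predicate(subject, object)."""
    return f"{predicate}({subject}, {obj})."


def siblings(triples):
    """Apply one hand-written rule: two people sharing a father are siblings."""
    # Roughly the Prolog rule:
    #   sibling(X, Y) :- father_of(F, X), father_of(F, Y), X \= Y.
    fathers = [(s, o) for s, p, o in triples if p == "father_of"]
    return sorted(
        (x, y)
        for f1, x in fathers
        for f2, y in fathers
        if f1 == f2 and x != y
    )


for triple in triples:
    print(to_prolog_fact(*triple))
print(siblings(triples))
# [('george', 'martha'), ('martha', 'george')]
```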
