Is The Semantic Web A Pipe Dream?

Posted by Cliff
from the stuff-to-think-about dept.
wdebruij asks: "I'm currently writing a small program for sharing information over the internet. For categorizing and indexing this information I want to use RDF and the semantic web as described by the WWW consortium, but since the documentation says nothing about a standard dictionary I seriously doubt we will ever have such a general information index. The Open Directory Project has written its directory in RDF, but does anyone know of another 'standard' dictionary?" The whole point behind the "semantic web" concept is that data is organized online in such a manner that a variety of different, independently designed machines can use it without compatibility issues.
This discussion has been archived. No new comments can be posted.

  • (Forgive the anonymous coward - my real info is below)

    Don't forget the WEB in semantic web. In my vision (and I've written about it in several places including here. [umd.edu]) users create their own term libraries (or, more precisely, ontologies) either from scratch or by extending existing ones. These can be networked together by including terms from other ontologies via explicit URI use, by declaring equivalences between terms (e.g. xxx:foo :equiv yyy:bar), by including ontologies in others, and by similar such linking. This creates a wide, complex, ugly graph of inconsistent knowledge -- hey, that's what we have on the regular web!!

    This can really work -- we've been playing with ontologies on the web for a while now, and if RDF terms get anchored in ontologies, and these get linked together, a powerful web of semantics can evolve.
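
    As a rough illustration of the linking described above, here is a minimal Python sketch. It is not DAML or any real RDF toolkit; the prefixes (xxx:, yyy:, zzz:), the "equiv" predicate, the page names, and both function names are placeholders made up for the example. It only shows how equivalence statements between independently written vocabularies let one query match statements made in another vocabulary.

```python
# Hypothetical sketch: statements are (subject, predicate, object) tuples.
# "equiv" is a stand-in for whatever equivalence term a real ontology
# language would define; it is an assumption, not a standard name.

def build_equivalence_lookup(triples):
    """Return a function mapping each term to a representative of its
    equivalence class, built from the 'equiv' statements (union-find)."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for subject, predicate, obj in triples:
        if predicate == "equiv":
            root_s, root_o = find(subject), find(obj)
            if root_s != root_o:
                parent[root_s] = root_o
    return find


def subjects_matching(triples, predicate, obj):
    """Subjects of statements whose predicate and object match the query,
    treating equivalent terms as interchangeable."""
    find = build_equivalence_lookup(triples)
    return sorted(
        s for s, p, o in triples
        if p != "equiv" and find(p) == find(predicate) and find(o) == find(obj)
    )


triples = [
    ("xxx:foo", "equiv", "yyy:bar"),      # ontology X links to ontology Y
    ("yyy:bar", "equiv", "zzz:baz"),      # ontology Y links to ontology Z
    ("page1", "xxx:foo", "semantic web"),
    ("page2", "zzz:baz", "semantic web"),
    ("page3", "yyy:other", "semantic web"),
]

matches = subjects_matching(triples, "yyy:bar", "semantic web")
print(matches)
```

    A query phrased in one vocabulary (yyy:bar) finds pages described with xxx:foo and zzz:baz because the equivalences chain together; page3 uses an unlinked term and is not matched.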

    Don't believe it can happen? DARPA (remember the ARPAnet? That was us) is supporting an effort, in close cooperation with the European Union Semantic Web effort, to bring this to fruition. To learn more about this effort, or to be able to try it yourself, check out the DARPA Agent Markup Language [daml.org]

    Jim Hendler
    Chief Scientist, Information Systems Office
    DARPA
    jhendler@darpa.mil

  • by Anonymous Coward

    You sure are right nobody wants to use Dewey or LOC.

    Dewey is AWFUL for a modern range of information. Go to your local public library and look in the stacks for books on computers. 004 & 005. That was the only area they could shoehorn this unforeseen-by-Melvil topic. In many suburban libraries, those two numbers may be as large as the entire 200 section (religion).

    Surprisingly, the resistance to LOC seems to be that most people are more used to Dewey.

  • Adoption of new W3C standards is slow and, normally, nonexistent. This frustrates the W3C, academics, and many who wish to profit from the new standards. The semantic web, I believe, is no different. The vast majority of the web is created by two groups: amateurs who use the web to give information to others and corporations who use the web to further their business strategy. Neither has a real incentive to adopt new standards.

    Amateurs simply want to share information. They have at most a few hours a week and generally a very limited knowledge of HTML. Their HTML is generally rooted in version 3. They don't use version 4. They don't use DHTML or XHTML. They don't care. They care about sharing information, not the underlying technology. Learning HTML is a necessary evil to get a web page created. It is not an end in itself. How will a semantic web help the amateur enough to overcome their own laziness? Why should I spend even two hours learning the new semantic markups or finding out which terms in the ontology match my web page? I just want my friends to see pictures of my wedding!

    For corporations, you have an even harder sell. The cost of training existing web page designers, hiring consultants, and retrofitting a normally large existing web site must be justified. The company must get something in return. But, the value of a semantic web is geometrically proportional to the number of sites that are a part of the semantic web.

    The network penetration paradox says that you must have a certain amount of value to a network before people perceive the value to themselves. Until you reach that amount of value, people won't join. But, the only way to reach that value is to have users. So, how do you jumpstart a new network? Lower the barrier to entry or subsidize new users. Neither of these is likely to occur with a semantic web.

    Lowering the barrier to entry entails reducing the cost of adoption. For a semantic web, this means that the new users shouldn't be subjected to seemingly endless debate about ontologies. Since ontologies are the basis of a semantic web, there will be endless debate. Debate about an ontology is inherent in ontology design because each user (person or corporation) has a different ontological model.

    Alternatively, you could subsidize early adopters. This is unlikely since it isn't clear who would have an economic incentive to subsidize users. W3C can't afford it. The government doesn't care. Corporations can't control usage of the internet to reduce the risk of subsidy.
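
    The adoption argument above can be made concrete with a toy threshold model. This is only a sketch under made-up assumptions: each site joins once at least some number of other sites has already joined, `seeded` sites are subsidized and join unconditionally, and the threshold numbers are invented for illustration. The function name and parameters are hypothetical.

```python
def simulate_adoption(thresholds, seeded=0):
    """Toy cascade: site i adopts once the current number of adopters
    is at least thresholds[i]. The first `seeded` sites in the list are
    treated as subsidized and adopt regardless of their threshold.
    Returns the final number of adopters."""
    adopted = [i < seeded for i in range(len(thresholds))]
    changed = True
    while changed:
        changed = False
        count = sum(adopted)  # adopters at the start of this pass
        for i, needed in enumerate(thresholds):
            if not adopted[i] and needed <= count:
                adopted[i] = True
                changed = True
    return sum(adopted)


# Ten hypothetical sites; each number is how many adopters a site wants
# to see before it considers joining worthwhile.
thresholds = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

nobody_moves_first = simulate_adoption(thresholds, seeded=0)
one_subsidized_site = simulate_adoption(thresholds, seeded=1)

# Same population with a higher barrier to entry for everyone.
higher_barrier = [t + 2 for t in thresholds]
stalled = simulate_adoption(higher_barrier, seeded=1)

print(nobody_moves_first, one_subsidized_site, stalled)
```

    With these invented numbers nobody joins unless someone moves first, a single subsidized site tips the whole population, and raising every threshold by two leaves that subsidized site alone. Real adoption is obviously messier; the point is only that the barrier and the subsidy are the two levers.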

    No, I don't think the semantic web will happen any time soon. If it does happen, it will look very different than our conceptions of what a semantic web is.

    Dave
  • When language first developed in our species there were no endless debates about ontologies ...
    Well, we don't really know that. Maybe there was; the world may never find out. But we do know the evolution of languages has left us with several hundred languages. Some are more similar than others. Allowing this evolutionary process on the web would create hundreds of little islands of ontological similarity.

    I've been involved in creating ontologies. Top down ontologies work in situations where either the ontology already exists (but has not been formally stated) or the group of people is very small. Bottom up ontologies work for some things.

    It sounds like you are proposing a bottom up ontology creation process. Well, the web already has that for pages. They're called META tags. We have ALT tags for text and images. There are semantics in HTML. They are primitive, but flexible enough to allow evolution. These tags are currently used by search and indexing engines. But, they mean different things to different people.

    Differences are fine for people. As the diagram you linked to shows, we have a personal semantic memory. I'd personally call that a personal semantic context, but the idea seems to be the same (or do we have an ontological conflict?). For value to be given to most people and corporations, the semantics must be machine parsable. They must give added value to machine and automated interaction. But, computers don't have a personal semantic memory. Despite AI advances, computers can not infer based on previous experience in any broad application. Even neural nets are limited to very strictly defined domains. A bottom up approach to ontology creation does not leave an ontology that is easily machine parsable.

    My main point was one of adoption. People and corporations need incentive to make changes. People hate change and corporations see change as risky. People won't invest in learning and changing without believing that it will help them significantly. Without the added value of making the web more automated for "intelligent agents", businesses do not have an incentive to take on the risk of change. Even if they did, they would be opening up their websites to more automation. Many corporations do not see that as a good thing. Look at legal challenges to deep linking or their reaction to competitive bidding sites.

    Automation of the process to reduce the cost of adopting new standards is always going to help. But, software makers (including myself) are notoriously bad about making complex user decisions simple for the end user.

    Dave
  • Seems like a lot of FUD - all we might be talking about to start with is a bit of extra metadata (in your META tags) that describes a few of your company's main pages, using Dublin Core vocabulary; the 'risk' is negligible.
    When I said risk, I did not mean strategic risk. I meant economic risk. Adding new technology requires an outlay of resources. Resources should only be expended if the reward is greater than the cost. But, there is no way to ensure that companies will get any rewards that they perceive. That is risk. Economic risk is like saying "What is the chance that we will recoup our costs?" Almost every business decision involves risk. Smart businesses reduce their risk through control, knowledge, expertise, insurance, etc. To better facilitate adoption, you need to reduce the barrier to entry, increase the rewards, or reduce a company's risk. The W3C has been notorious for ignoring the incentive to adopt new standards.

    Dave
  • by BobGregg (89162) on Wednesday March 21, 2001 @10:53AM (#349492) Homepage
    If you're looking for a standard "vocabulary" to use in the context of RDF, W3's RDF FAQ [w3.org] has a link [ukoln.ac.uk] to suggestions about how to implement the Dublin Core [dublincore.org] tags via RDF. For a more specific and extensive vocabulary, you're probably right - there's very little agreement about what sort of standard to use. It's kind of ironic actually; libraries have been using one of two different organizational systems (Dewey or LOC) for roughly a century, either of which seems like it would lend itself handily to indexing the web topically. Yet in the quickest-growing body of knowledge on the planet, nobody wants either of those, and nobody seems to be able to agree on anything new either.
  • by aswartz (126863) on Wednesday March 21, 2001 @11:11AM (#349493)
    I feel sort of bad plugging my own group, but this is exactly the problem that SWAG [purl.org] is meant to solve. SWAG is the Semantic Web Agreement Group, and we bring different users of the SWeb together to try and build sets of common terms. Our current project is to build a dictionary of common terms, which you can find at: WebNS.net [webns.net].

    Obviously, the Semantic Web won't work if we only have one dictionary, but it will work much better if we agree on the terms we use when possible. So SWAG isn't trying to enforce terms, but merely recommend them.

    We work on a process of consensus [purl.org] so that we can move quite quickly and new terms don't get bogged down in endless talking.

    So I hope you'll visit us, once again the address is: http://purl.org/swag/ [purl.org].

  • You cannot organize Chaos. Order can emerge from Chaos, but it is generally a very localized and temporary thing. Like it or leave it.
  • "you'd have to get users, most of whom are wholly uninterested in finding web resources, to use the Semantic web system."

    Aren't most web users interested in finding web resources? That's why they spend hours using search engines...

    "At this point in time it would be practically impossible to backtrack to a systematically laid out web."

    I don't think anybody in semantic web research is suggesting this...

    "Probably the best you could do is simulate one using a search engine that constantly sought and categorized pages intelligently."

    This is more what people are suggesting; by defining and publishing terms (and making them machine parseable), equivalences between ontologies can be identified and applied automatically, along with other machine reasoning.

    "Even then, you'd have to convince people to use your search engine, which, unless you really provide a superior one like Google or a deep catalog like Yahoo!, would be pretty hard to do."

    Not a problem if Google and Yahoo start to make use of the semantic web technology for their searching and indexing...

    David Allsopp.

  • When language first developed in our species there were no endless debates about ontologies ...

    Well, we don't really know that. Maybe there was; the world may never find out. But we do know the evolution of languages has left us with several hundred languages. Some are more similar than others. Allowing this evolutionary process on the web would create hundreds of little islands of ontological similarity.

    We translate between those human languages all the time though, don't we? We can't automate that very well because human language is complex; the semantic web languages aim to be machine-readable though.

    My main point was one of adoption. People and corporations need incentive to make changes. People hate change and corporations see change as risky. People won't invest in learning and changing without believing that it will help them significantly. Without the added value of making the web more automated for "intelligent agents", businesses do not have an incentive to take on the risk of change. Even if they did, they would be opening up their websites to more automation. Many corporations do not see that as a good thing. Look at legal challenges to deep linking or their reaction to competitive bidding sites.

    Seems like a lot of FUD - all we might be talking about to start with is a bit of extra metadata (in your META tags) that describes a few of your company's main pages, using Dublin Core vocabulary; the 'risk' is negligible.
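
    To give a feel for how small that extra metadata is, here is a hedged Python sketch that renders a handful of Dublin Core elements as HTML META tags. The `DC.` name prefix follows the usual convention for embedding Dublin Core in HTML; the function name, the company, and the field values are invented for the example.

```python
from html import escape


def dublin_core_meta(fields):
    """Render a mapping of Dublin Core element names (e.g. 'title',
    'creator', 'subject') to one HTML <meta> tag per line."""
    return "\n".join(
        '<meta name="DC.{}" content="{}">'.format(
            escape(name, quote=True), escape(value, quote=True)
        )
        for name, value in fields.items()
    )


# Hypothetical description of a company's main page.
tags = dublin_core_meta({
    "title": "Acme Widgets: Product Catalog",
    "creator": "Acme Corp",
    "subject": 'widgets, "industrial" fasteners',
})
print(tags)
```

    Three lines of markup per page is the whole outlay in this sketch; values are HTML-escaped so quotes in a description cannot break the tag.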

    I agree though, that well-designed automated tools are a challenge, and will be needed to gain widespread use.

    If this technology gains some acceptance, people will adopt it for fear of being left out in the cold by the search engines (I hear managers saying to their techies "We need to be semantically enabled - I don't know what that means but I want it!")

    David Allsopp.

  • How long will this classification take and how long would any information be good?

    How long did it take for everybody to put in META tags and register their pages regularly with search engines? It's a gradual continuous process, not all-or-nothing.

    Joe Blow over at Geocities just put up his Britney Spears page. One day he may find Bjork his muse. He sure isn't going to inform some search engine of his updates. How long before the search engine comes back around to check his page out? Anything can be outdated by the time you find it in a search engine, which is not so different a situation than exists now,

    Exactly - the situation isn't any different as far as registering the information goes. The difference is in the quality of the information, and the fact that machines can do something useful with it, and that we can now tell easily that the Spears page is about Britney, not about primitive weaponry.

    but no one is trying to put any handcuffs on web publishers now.

    What handcuffs are you referring to? Nobody controls the web, you know that - you can publish your pages however you want.

    If we can create better ways then you're free to use them or not, but better ways will mean more of the right people find your site.

  • When language first developed in our species there were no endless debates about ontologies ... no, people just started using words and they ~magically~ started meaning approximately the same thing to different people. Methinks the semantic web will develop in just such a manner. All the semantic web needs to develop is some user friendly tools that allow people to do what they already want to do: 1) find things 2) publicize their own things that they want others to find. These are sufficiently compelling reasons for the sem web to develop ... all we need is the tools to make it so.

    Here is a diagram you might find pertinent to this discussion: Coherant experience on the semantic web, and why we need personal semantic memory. [robustai.net]

    PS: If anyone wants to help with the effort of creating these tools, please send me an email seth@robustai.net [mailto]

  • I see the semantic Web as a matter of When and not 'if'. I have been working on the Web since 1992 when I joined Tim at CERN to work on the Web. That is nine years ago folks. The full semantic Web could take as long again, perhaps twenty.

    Worst thing that happened to the Web was the false expectations set by 'internet time'.

    If on the other hand people want to look at a real world application of an assertion scheme see my work on XTASS [xmltrustcenter.org], follow the links through to XTASS. What we are planning to do is to apply assertion markup to address real world, near term business problems.

    The Security Technical Committee of OASIS is performing work that uses a lot of the XTASS specification. If I wasn't flaming on Slashdot now I would be writing the specification. The OASIS group is meant to be handing in a deliverable by July (yes really). Now it may be late 2002 before Security Assertion Markup Language is a reality, but there is major momentum behind it.

    What Tim and Ralf are working on is basic research. They have a lot of flexibility, they can fail for example. What I am working on is solving real problems for an early adopter community.

    If we get enough early adopter communities together we might see a more general semantic framework by mid decade. I suspect that this will require some additional work on the natural language problem as a means of filling out the initial RDF frames and on the collaboration problem as a means of refining them.

    I helped to put together the coalition that established critical mass for the Web. I think that there is a good chance we can do a repeat here. OK so the semantic web is a vastly harder business than the document web. However when we got the Web off the ground Tim could not get a paper published as a poster at the hypertext conferences, now he is the keynote speaker.

    The point about the 'web' in semantic web is key. I think that the problem with many ontology programs was that they started from the concept of a central server.

    Also the use of the word 'ontology' is bogus. What is being talked about is not an ontology; it is a vocabulary of shared terms and not a system of being. The confusion comes from misreadings of Heidegger. What we need is not an Ontology but an intersubjective shared vocabulary, folk need to read Habermas (and then explain it to me please:-).

    If people thought the semantic web was possible it would not be worth doing.

  • Aren't most web users interested in finding web resources?

    I'd have to say no. Yes, you and I are interested in finding things on the web. We use search engines to find these things. The average web user (I'll use my mom as an example) can't tell the difference between Internet Explorer, Earthlink, Yahoo!, Hotmail, and the Internet. For her, these things are all equivalent. She is interested in "going" to places on the web, but she is pretty much restricted to those places that she can link to from Yahoo! or hear from her friends.

    I believe that the majority of users are just like her. As a result, web communities are able to form at places like The Motley Fool [fool.com] and the message board at Yahoo! [yahoo.com]. Users find a little place they like to frequent, put it in their favorites (because they don't know how to make it their startup page) and never venture very far. For these people (the majority of users) a semantic web would be completely useless.

    Probably the best you could do is simulate one using a search engine that constantly sought and categorized pages intelligently.

    This is more what people are suggesting; by defining and publishing terms (and making them machine parseable), equivalences between ontologies can be identified and applied automatically, along with other machine reasoning.

    How long will this classification take and how long would any information be good? Joe Blow over at Geocities just put up his Britney Spears page. One day he may find Bjork his muse. He sure isn't going to inform some search engine of his updates. How long before the search engine comes back around to check his page out? Anything can be outdated by the time you find it in a search engine, which is not so different a situation than exists now, but no one is trying to put any handcuffs on web publishers now.

    Dancin Santa
  • There are a zillion different things going on on the web every moment. Even if you were to successfully implement your program, you'd have to get users, most of whom are wholly uninterested in finding web resources, to use the Semantic web system. Essentially, you'd become just another niche in an infinite universe of niches.

    At this point in time it would be practically impossible to backtrack to a systematically laid out web. Probably the best you could do is simulate one using a search engine that constantly sought and categorized pages intelligently. Even then, you'd have to convince people to use your search engine, which, unless you really provide a superior one like Google [google.com] or a deep catalog like Yahoo! [yahoo.com], would be pretty hard to do.

    Reminds me of a story...

    Tim was a software designer, he designed the coolest technology and its use spread to all corners of the world. People used his technology to share ideas and conduct business. With different people pulling his technology in a hundred directions at once, his technology grew and became chaotic and beautiful. Unfortunately, the information was becoming factioned and harder to find for Tim.

    To seek a remedy he went to the council of shamans and asked for help. "The solution to your information problem is easy. You only need to apply a systematic storage structure to your technology," said the council chief. The chief waved a bloody chicken over Tim's head and spat on him several times then handed him a goosedown pillow. "Take this pillow to the top of Mt. Shasta. Once there, shake out all the feathers into the wind. Return with your results."

    Tim left and did as the shaman said and returned with the empty pillowcase. "I emptied the pillow of feathers, but I still don't understand what this has to do with applying a systematic storage structure to my technology," he said to the shamans.

    "Then go back to the top of the mountain and put all the feathers back into the pillowcase."


    Dancin Santa
  • by Edd_Dumbill (410700) on Wednesday March 21, 2001 @11:24AM (#349502) Homepage

    As the Semantic Web is a layered framework, the actual vocabularies you use to describe things are applications of the framework rather than the framework itself.

    One such application that might prove useful in what you're tackling is RSS [purl.org]. What you seem to be looking for is a taxonomy against which you can classify things. These are expensive to develop and hence rare, the ODP being one of the few that are public. My advice is, if the ODP doesn't fit, classify by topic yourself (but avoid the mistake of struggling to produce a hierarchical system, this is rarely appropriate). At a later stage you can express equivalences to other folks' categories. Folks on the RSS-DEV mailing list would be happy to share experience of categorization.

    Anyone seeking more information as to what the Semantic Web actually is and how it fits together might be interested in some of the articles I've written on the topic, which give an overview both of the vision and of ways you can get started with tools:

    -- Edd Dumbill, Editor, XML.com [xml.com].
