The Internet

Going from a 'Web of links' to a 'Web of meaning'

neutron_p writes "Computer scientists from Lehigh University are building the Semantic Web, which will handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions."

  • When (Score:2, Insightful)

    by cbrocious ( 764766 )
    When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats? I can see huge potential in a system built specifically for getting data out there and linking it all together.
    • Re:When (Score:1, Insightful)

      What on earth makes anyone think a computer has anything to do with "meaning"? That would require understanding, which would require thought, which would require consciousness. Which is exactly what a machine explicitly does NOT have.
      • Re: When (Score:2, Funny)

        by vasubhat ( 733530 )
        ... And when the computer does indeed possess understanding, thought, consciousness and the likes, and goes about doing something with it, the Vogons go and destroy it before the job is done.
    • Unless (Score:4, Interesting)

      by Taco Cowboy ( 5327 ) on Saturday October 09, 2004 @10:43AM (#10479343) Journal
      You gotta understand that "meaning" has no meaning at all to machines, at least not yet.

      And even for humans, the "meaning" of a certain thing can be a different thing to different people!

      Although I applaud the job they are doing for Semantic Web, I wonder how they can inject "meaning" into the whole thing.

      My biggest fear is the 1984-like "my meaning is THE meaning and you canna have any other meaning" thing.
      • by LionKimbro ( 200000 ) on Saturday October 09, 2004 @11:33AM (#10479643) Homepage
        A message has "meaning" if you can make special use of it.

        Normal web pages have meaning for browsers, it's just that that meaning is limited to "how to draw words for the user."

        What we're doing is making it so that your computer can make special use of messages on the web to do smarter things.

        It would be scary if the Semantic Web [w3.org] were about "my meaning is THE meaning." But it is explicitly not like that. In fact, one of the main things about it is that anyone can make up their own languages, their own way of modelling the world.

        There are tools [w3.org] that make it so you can say, "My word X is sort of like their word Y," but it's acknowledged that such translations will be imperfect. Likely, fuzzy logic, and systems that are able to ask for clarification (and remember responses), will be used to mediate that sort of thing.

        You may also be interested in my favorite page on AI by Open Mind. [kurzweilai.net] The Semantic Web isn't explicitly about AI, but it opens the door for a lot of AI work.
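
        The "my word X is sort of like their word Y" idea above can be sketched roughly as follows. This is only an illustration: the vocabulary names, the mapping weights and the `translate` helper are all made up here, and it is not how any particular W3C tool works. A weak match is flagged so that a person could be asked to confirm it.

```python
# Hypothetical vocabularies and mapping weights, invented for illustration only.
MAPPINGS = {
    ("shop:cost", "store:price"): 0.9,   # strong correspondence
    ("shop:maker", "store:brand"): 0.6,  # "sort of like", imperfect
}

def translate(term, threshold=0.8):
    """Return (target_term, confidence, needs_clarification) for a source term."""
    candidates = [(dst, w) for (src, dst), w in MAPPINGS.items() if src == term]
    if not candidates:
        # Nothing known about this term: the user would have to be asked.
        return None, 0.0, True
    dst, weight = max(candidates, key=lambda c: c[1])
    # A weak match is flagged so that a person can be asked to confirm it.
    return dst, weight, weight < threshold

result_strong = translate("shop:cost")
result_weak = translate("shop:maker")
result_unknown = translate("shop:colour")
```

        A system that remembers the answers to its clarification questions could simply write the confirmed pairs back into the mapping table with a higher weight.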

      • That's just what the computers want you to think...
    • >When will we be dropping HTTP and HTML in favor of more
      >metadata-friendly protocols and file formats?
      When IPv6 is fully deployed, and the US has a black female president who
      just invented cold fusion.
    • "When will we be dropping HTTP and HTML in favor of more metadata-friendly protocols and file formats?"

      When google-spammers stop putting 8 million irrelevant words in their
      <meta name="" content="">
      tags?
  • Ummm (Score:4, Insightful)

    by bo0ork ( 698470 ) on Saturday October 09, 2004 @10:29AM (#10479251)
    The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions

    Sounds like a recipe for disaster to me.

    • Re:Ummm (Score:2, Funny)

      by amalcon ( 472105 )
      Yeah, hopefully there aren't many easily-offended cat enthusiasts out there. They might not appreciate some of the more, er, "exotic" sites they find...
    • Re:Ummm (Score:5, Insightful)

      by bobbis.u ( 703273 ) on Saturday October 09, 2004 @10:49AM (#10479386)
      Yeah, I would tend to agree.


      One of the reasons the internet has become so popular is because everyone can have their say. Unfortunately, this has the side effect that there is a lot of incorrect and misleading information out there. Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity. Even major news outlets and scientific publications have been caught out by this in the past.

      • Re:Ummm (Score:5, Insightful)

        by ezzzD55J ( 697465 ) <slashdot5@scum.org> on Saturday October 09, 2004 @11:04AM (#10479487) Homepage
        'Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity'

        There is another way in which it's self-reinforcing. People look for sites and pages and people that reflect their own opinions.

      • Re:Ummm (Score:3, Interesting)

        by LionKimbro ( 200000 )
        Two things:
        1. "Webs of trust." People will make pages telling what pages they believe have a good reputation and generally tell the truth. If someone fills the web with a ton of random statements, they will have a low reputation.
        2. Computers will have "beliefs" reflecting their owner's own. You will tell the computer, "I believe this is true," and the computer will absorb the package of information. You can say, "I believe this is false," and the computer will absorb the package of information, and put it i
        • Re:Ummm (Score:3, Insightful)

          by jsebrech ( 525647 )
          "Webs of trust." People will make pages telling what pages they believe have a good reputation, and generally tells the truth.

          That won't work for stuff that's politically sensitive, since people will mod sites down just because they dislike what the site says, even if it is accurate. It also gets really complicated with sites that are accurate on one subject but don't know jack about another.

          Computers will have "beliefs" reflecting their owner's own.

          In that case, what's the point? If your computer onl
      • by Jerf ( 17166 )
        Unfortunately, this has the side effect that there is a lot of incorrect and misleading information out there. Everything is also self-reinforcing, because one person often copies their "facts" from another website without first checking the veracity.

        I am interested in where you have found a source of information that does not match this description.

        I was not aware extra-terrestrials were running libraries I could get at. Certainly no known human data source has ever risen to this standard.

        (Criticizing
      • Exactly.

        One of the biggest stumbling blocks of the semantic web (semantic anything, in fact) is Trust: how do you know the other guy is telling the truth? Human beings are very good at evaluating trustworthiness from a website, but when we switch to a web made for understanding by machines, we lose that ability. We need some kind of trust infrastructure which assigns credibility to sources and so on...

        Another stumbling block is common ontologies, i.e. how do we know we are talking about the same thing, th

    • This could create huge problems for people to stay on the right side of copyright law. A medium that pulls information from several different sources could potentially make it much harder to avoid copyright infringement. For example, you pull from a Wikipedia entry, a NY Times entry and a Reason editorial. You better keep track of where you got each part if you use them in any of your own research, commentary, etc.

      How does it combine information from different sources in a way that keeps the user knowledge
    • And I get 200 ads for herbal viagra, 300 Nigerians that have inherited 15 MILLON USDOLLARS, and deviant pornography.

      A semantic web is only as useful as the metadata, and people go to great lengths to mislead and disguise.

      • Right, we need some sort of trust mechanism much better than "how many people are linking to this page". It's pretty easy to see how gameable Google is - there's no reason people won't try to game semantic content to push their products and services as well.

        In fact, it seems like the trust problem isn't that different at all, perhaps the only real difference is that with the WWW, you get to look at every page yourself and make the judgment call, "does this look like a scammer, are there lots of blink tag

    • Anybody remember the demise of META keywords?

      I think we could run into the same problem with the Semantic Web, as it too allows web developers to attach arbitrary metadata to their pages. The only way to prevent unscrupulous web developers from embedding inaccurate RDF in their pages in hopes of attracting more hits is by establishing a web-of-trust framework.
      Google implements a very crude version of web-of-trust that assumes "incoming hyperlinks==trust". I think that in order for the Semantic Web to be
  • Something similar. (Score:4, Informative)

    by modifried ( 605582 ) on Saturday October 09, 2004 @10:30AM (#10479259) Homepage
    Covered not long ago [slashdot.org] - an interview with Berners-Lee regarding the Semantic Web [technologyreview.com].
  • Why is this news? (Score:4, Informative)

    by multipart ( 732754 ) on Saturday October 09, 2004 @10:33AM (#10479281)
    People at DERI in Ireland's Galway are also working on the Semantic Web (see http://www.deri.ie/). I thought lots of people are...
    • They are - there are several major European consortia, many involving the University of Sheffield where I work on Semantic Web Services, as well as lots of US work especially deriving from DARPA and CMU work on agents...
      • Ditto the University of Southampton [soton.ac.uk]. I've been working on a SW-related project, AKT [aktors.org], for the last four years; as part of this work, I was a member of the W3C working group (along with Jeff Heflin) that wrote the OWL Web Ontology Language.

        Other places to look at are Jim Hendler's MIND group [umd.edu] in Maryland, which has been doing some sterling work over the last few years (as an aside, Jeff used to be Jim's PhD student).

  • by NoTheory ( 580275 ) on Saturday October 09, 2004 @10:33AM (#10479284)
    I'll have to rtfa to see what they propose, but just the principle of resolving contradictions is a really difficult one. Most theories of knowledge (which are essentially networks of facts) aren't terribly robust, and contradiction repair, which involves running the entire network to find invalid assumptions and then propagating the changes, is NP-complete :| I'm not positive that contradiction resolution is a reasonable thing to expect out of a massive distributed network.
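
    To make the "network of facts" picture concrete, here is a minimal sketch of the easy half of the problem: deriving everything a small rule set implies and spotting atoms that end up both asserted and denied. The facts, rules and function names are invented for illustration. The sketch does not attempt the repair step (deciding which assumption to retract), which is the part described above as expensive.

```python
def negate(literal):
    """Return the negation of a literal written as 'p' or 'not p'."""
    return literal[4:] if literal.startswith("not ") else "not " + literal

def find_contradictions(facts, rules):
    """Forward-chain the rules to a fixpoint; return atoms derived both ways."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    # An atom is contradictory if both it and its negation were derived.
    return sorted(f for f in known
                  if not f.startswith("not ") and negate(f) in known)

# Hypothetical toy knowledge base (an assumption, not from the article).
facts = {"bird(tweety)", "penguin(tweety)"}
rules = [
    (["bird(tweety)"], "flies(tweety)"),
    (["penguin(tweety)"], "not flies(tweety)"),
]
conflicts = find_contradictions(facts, rules)
```

    Here `conflicts` names the atom that was derived in both polarities; a repair procedure would then have to search for which fact or rule to give up.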
    • by mfh ( 56 )
      I'll have to rtfa to see what they propose, but just the principle of resolving contradictions is a really difficult one

      Yeah, this is really easy. Just look next to the title and see what score the moderators have assigned and you get a sense of whether there be contradictions! Generally if the score is lower than 1, there could be contradictions so:

      if ($score < 1) { $contradiction_level++; } elseif ($score >= 3) { $contradiction_level--; }

      Yeah it's really difficult.

    • Using the Semantic Web to resolve contradictions is probably its least common application, at least in these early stages. Also, it is not meant to be a global information store (in which all contradictions may be resolved). It is meant to be large numbers of globally connected information stores, and contradictions may be resolved between small numbers of these.

      Also, the ontology of the semantic web comes in 3 flavors, OWL Lite, OWL DL, and OWL Full. The first two are limited enough that they are decidable (I'm not su
      • Any system that is used in the real world and cannot at its core handle inconsistent information (and no deductive system can) is fundamentally flawed for this use. Expect from the semantic web the same as for automatic translation (60s) and expert systems (80s).
        • The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay. The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as. It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers. Can it know everything? Of course not! But it can make
          • Silly me, not previewing.

            The World Wide Web cannot "at its core handle inconsistent information" yet it seems to lurch along okay.

            The Semantic Web is not some attempt at global knowledge, perfect knowledge, perfect reasoning, or anything of the sort, regardless of what many posters, including yourself, seem to have construed it as.

            It is intended to be an analogue of the World Wide Web, which is primarily consumed by humans, that is instead primarily consumed by computers.

            Can it know everything? Of cours
            • Thanks for the explanation, your story makes a bit more sense than the article(s) I read about ontologies, distributed ontologies, merging distributed ontologies and research into merging mutually contradicting distributed ontologies, which seem to be linked together with a brittle inference process that will fall flat on its face when inconsistent information is entered. You guessed it, I remain sceptical.

              About the web, it's perfectly capable of handling inconsistent information, as all the information it

              • I think you'll find that most of the sensationalized research that gets put out on the Semantic Web is not in fact most of the research that goes on :) (for instance, we hear all about the big expert systems research still going on, even, but remarkably little about the often very successful, if less grand in scope, machine learning research).

                Most of the practical ontology research focuses on internal ontologies in data stores, so that an RDF store can return "more" information than was put in. Only inform
                • Just as a bit of context, I'm a researcher in machine learning who's extremely jealous of the amount of funding my colleagues just received for their good-old-fashioned inference technology on the semantic web (they did a great job getting the money, but I think they're on the wrong path).

                  But okay. I don't think we're in disagreement here: I totally agree with you that the "low level tools" are the ones that are of interest here, I'm just sceptical about the scalability of the grand objective: crisp infe

                  • by ngibbins ( 88512 )

                    I agree with you completely on this point. The most important advances that have been made in the knowledge engineering community over the last decade have been those that have tried to fuse non-symbolic and machine learning techniques with the good old-fashioned AI of expert systems.

        • The knowledge engineering community has moved on since the expert systems of the 1980s, and techniques for handling uncertainty and inconsistency are now commonplace. The SW draws heavily on this experience.

    • alright, having read the friggin' article, all i have to say is that they have their work cut out for them.

      the problem with searching currently is that only librarians, who've had at least a year or two of graduate studies really know the ontology that libraries use. Common users bring their own concepts and ontologies to bear when they're searching for information. But if you move away from the monolithic single ontologies that libraries use, you have the problem that you have to be open to the fact tha

      • "the ironic thought that pops to mind is that if you've got a set of universal descriptors, then don't you already have an ontology? And if you don't have a set of universal descriptors, how would you ever create a coherent ontology?"

        There's nothing particularly ironic about it. The question you're asking exposes a fairly common misunderstanding of what the Semantic Web's all about. Several years ago, I attended the talk by Tim Berners-Lee in which he announced the principles of the Semantic Web. As I rec

      • With this many nay-sayers in the world, I'm surprised anything new is ever achieved.
    • The Semantic Web as envisaged by the W3C is based on the RDF and OWL languages; the latter has a Description Logic as its underlying formalism, which is a subset of first order predicate logic with computationally attractive properties that lead to tractable decision procedures for satisfiability.

      Distribution is a separate issue. While assembling the parts of a distributed ontology may be expensive, it doesn't affect the algorithmic complexity of determining whether a set of axioms contain a contradiction.
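
      As a very rough illustration of what "determining whether a set of axioms contains a contradiction" can look like, the sketch below checks one individual against a toy class hierarchy with a disjointness axiom. It is far simpler than a real description-logic reasoner, and the class names and helper functions are assumptions made up for this example.

```python
# Hypothetical toy ontology: subclass links and pairs of disjoint classes.
SUBCLASS_OF = {"Penguin": "Bird", "Bird": "Animal", "Rock": "Mineral"}
DISJOINT = [("Animal", "Mineral")]

def ancestors(cls):
    """Return the class together with all of its superclasses."""
    seen = [cls]
    # The second condition guards against accidental cycles in the table.
    while cls in SUBCLASS_OF and SUBCLASS_OF[cls] not in seen:
        cls = SUBCLASS_OF[cls]
        seen.append(cls)
    return set(seen)

def is_consistent(asserted_types):
    """True if no two inferred classes of the individual are declared disjoint."""
    inferred = set()
    for cls in asserted_types:
        inferred |= ancestors(cls)
    return not any(a in inferred and b in inferred for a, b in DISJOINT)

ok = is_consistent(["Penguin"])
clash = is_consistent(["Penguin", "Rock"])
```

      Whether the axioms were fetched from one site or from many does not change this check, which is the point made above about distribution.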

  • Snake oil... (Score:3, Insightful)

    by Alomex ( 148003 ) on Saturday October 09, 2004 @10:36AM (#10479305) Homepage


    with their favourite mode of publication being the press release.

  • by murderlegendre ( 776042 ) on Saturday October 09, 2004 @10:47AM (#10479368)

    ..from user's queries.

    Clippy..? Is that you?

    • Clippy..? Is that you?

      "it looks like you are searching for pr0n. would you like me to lock the door and dim the lights via X10? how about some romantic music? maybe add to your search results a froogle side-bar with the best astro-glide prices on the net?"

  • by Eloquence ( 144160 ) on Saturday October 09, 2004 @10:47AM (#10479372)
    Who is "building the semantic web"? Academics or web authors? The only semantic web technology that has actually gained wide usage in the sphere of user-generated content is RSS, a syndication format (or rather, a bunch of competing syndication formats). The reason for this is that weblog engines like Slash and Movable Type support syndication. This then allowed programmers to create news aggregators and filters.

    The same can be said about any semantic web technology - whether it's FOAF [foaf-project.org] (an RDF vocab for describing people and their interests) or a vocabulary for reviews [ideagraph.net]. As soon as major authoring tools (i.e. both web editors and content management systems) start integrating these technologies, people will use them if they are useful. Do not expect web designers or bloggers to have a clue about all the great things that the semantic web can do - give them one useful thing which they understand, package it in a pretty UI, and they will start using it.
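
    A news aggregator and filter of the kind mentioned above can be sketched in a few lines. This assumes plain RSS 2.0 (one of the competing formats), uses two made-up inline feeds instead of fetching anything, and the `read_items` and `aggregate` helpers are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Two tiny inline RSS 2.0 documents; the feed contents are made up.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Semantic Web notes</title><link>http://example.org/a1</link></item>
</channel></rss>"""
FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>RDF in practice</title><link>http://example.org/b1</link></item>
<item><title>Another post</title><link>http://example.org/b2</link></item>
</channel></rss>"""

def read_items(feed_xml):
    """Yield (feed title, item title, link) for each item in an RSS 2.0 string."""
    channel = ET.fromstring(feed_xml).find("channel")
    source = channel.findtext("title")
    for item in channel.findall("item"):
        yield source, item.findtext("title"), item.findtext("link")

def aggregate(feeds, keyword=None):
    """Merge items from several feeds, optionally filtering titles by keyword."""
    items = [entry for feed in feeds for entry in read_items(feed)]
    if keyword:
        items = [e for e in items if keyword.lower() in e[1].lower()]
    # Sort by item title so the merged list has a stable order.
    return sorted(items, key=lambda e: e[1])

all_items = aggregate([FEED_A, FEED_B])
rdf_items = aggregate([FEED_A, FEED_B], keyword="rdf")
```

    The aggregator needs no knowledge of either weblog engine, only of the shared format, which is why tool support on the authoring side matters so much.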

  • I guess that the Semantic Web would need HTML documents to meet strict requirements when it comes to validation, use of logical instead of physical markup and so on. This could be an incentive for people to use HTML the way it was intended, instead of the crapload of pages that don't close tags, use hundreds of redundant FONT tags, use the H1..H6 elements to control font size instead of using them to indicate headings, and so on. Strangely enough, all "beginner's" HTML books still teach people to code this
    • I'm not sure how I feel about that. On the one hand, moving to a pure data-in-HTML, presentation-in-CSS model works wonders for helping machines decipher the mess. On the other hand, when I see things like

      <font size=+1><font color=red><font face=verdana><font color=blue><font face=arial><font size=+2><font size=+1>Welcome to Example.com! (best viewed in Internet Explorer at 800x600 with at least 256 colors)</font></font></font></font></

    • I agree completely - most of the beginner HTML books I've read seemed bent on teaching that content and layout go together, exactly the opposite of what the W3 advocates. Luckily there are a few beginner books [peachpit.com] that teach HTML and CSS side-by-side, but, as an instructor, I'd like to see this approach adopted by all instead of a few.

      The Semantic Web sounds great, but I really don't trust people creating websites to include pertinent and accurate metadata about their site. If someone creates a site and simpl

      • "The Semantic Web sounds great, but I really don't trust people creating websites to include pertinent and accurate metadata about their site."

        You're quite right to suspect that the Semantic Web won't start in the blogs of the world. It doesn't scratch any particular itch for individual web authors.

        But consider its value for a business that works with dozens (or hundreds, or thousands) of large clients, all of whom submit their data in more or less arbitrary formats. There is huge value for them in stan

    • Well, surely good HTML could have helped in providing more semantics. For example, a table that had "price" in a TH could make it easier to guess that the numbers in the associated TDs were, well, prices for a product.

      However, HTML is not so relevant in the Semantic Web. There are many reasons for this, but I guess one is that it is expected to never get beyond tagsoup... Well, I dunno...

      It is RDF [w3.org] that is at the core of the Semantic Web. Funny, I have been interested in RDF for six years, still I have

  • by The_reformant ( 777653 ) on Saturday October 09, 2004 @10:50AM (#10479395)
    The semantic web is a pretty popular area of research right now, and it's far from being "built by computer scientists at Lehigh University". In fact, I could have done an undergrad dissertation on the semantic web, and there were numerous PhD positions being advertised at unis around the world for research on the semantic web.
    Whichever Lehigh professor submitted this is stooping pretty low trying to raise publicity (and hence finance), I would think!
    • A lot of people from a lot of universities are probably working on the same idea. There just happens to be an article about the professors at Lehigh.

      I took some classes from Professor Heflin; he's a very bright guy. As for the semantic web, I don't think it will catch on. When you write your web pages you have to follow a strict schema and add all this metadata to each page for it to be 'correct'. Most users could give two shits about this metadata and you'll still have chaos in the web.
  • by SuperBanana ( 662181 ) on Saturday October 09, 2004 @10:51AM (#10479398)

    Am I the only one who recognized the main graphic for the story as a lifted screencap from the movie Hackers? That movie's SOLE redeeming quality was Angelina Jolie...

    Well, ok, that and the laugh factor. Not quite as much fun as MST3K'ing The Mummy with about a half dozen friends though.

  • I have had RDF on my web site for years, but last year as an experiment, I started a web spider running that specifically looked for RDF - I found very little.

    I even cheated and specified the 'seed' starting web sites as sites that I knew to use RDF.
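
    The detection step of such a spider might look something like the sketch below. It is not the poster's code: it assumes a page counts as "RDF" if it is well-formed XML containing an `rdf:RDF` element in the standard RDF namespace, and the sample documents are invented. Fetching and link-following are left out.

```python
import xml.etree.ElementTree as ET

# The standard RDF syntax namespace URI.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def contains_rdf(document):
    """Return True if the text parses as XML and has an rdf:RDF element."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return False  # not well-formed XML (for example tag-soup HTML)
    return any(el.tag == "{%s}RDF" % RDF_NS for el in root.iter())

# Made-up sample documents standing in for fetched pages.
rdf_doc = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description rdf:about="http://example.org/page"/>'
    "</rdf:RDF>"
)
rss2_doc = '<rss version="2.0"><channel><title>Feed</title></channel></rss>'
html_doc = "<html><body><p>unclosed paragraph<br></body></html>"

results = [contains_rdf(d) for d in (rdf_doc, rss2_doc, html_doc)]
```

    Checking the content rather than the file extension matters here, since RDF can be served under names like *.rss or *.xml as well as *.rdf.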
    • Re:too little RDF (Score:3, Informative)

      by Tony Hoyle ( 11698 )
      There's absolutely loads of it around... especially as people are starting to use more generated websites (like slashdot for example).

      If you search for *.rdf maybe you won't find as much... a lot of it is *.rss, *.xml and other things.

      Also, Google doesn't index them.
  • by Mazzaroth ( 519229 ) on Saturday October 09, 2004 @10:55AM (#10479435) Homepage
    Semantic web is an amazing idea that will profoundly transform the way we interact with information. But I can see a huge amount of work remaining to be done:
    • We need an ontology that will cover many if not all aspects of human experience. And this experience has been evolving dramatically and will continue to evolve. This ontology is probably a moving target. The task of creating the ontology alone has been, and still is, the holy grail of AI and Knowledge Management.
    • The amount of time we will have to invest in adding metadata to the data will dramatically increase over time. We will need a way to automate the filling of the metadata layer. This is where automatic image recognition and classification, speech to text, text summarizers and meaning extractors kick in (here, Copernic [copernic.com] is in the right direction). Maybe the librarian profession will be the next hot job...
    • Almost every application will have to adapt and inter-communicate. No big deal, RDF [w3.org] will probably become the new data bus anyway.
    That will be interesting!!!
    • We need an ontology that will cover many if not all aspects of human experience.

      One of the advantages of the Ontology as a model is that we can avoid needing a 'global' one, instead we can compose ontologies and translate between them to create the semantic viewpoint.

      The amount of time we will have to invest in adding metadata to the data will dramatically increase over time

      There are additional issues, such as 'faithless' annotation (liars and miscreants) as well as genuine errors (human or other). Tag
  • I still don't know why this feature isn't used to make the web more powerful by offering more links on the same web page:
    On the same page, the level of links should be increasable/decreasable. The default one would be the one we see currently on all the web sites.
    When going to the next level, the page would not reload at all but the browser would just show the links at different places on the page. These links would have been set by the webmaster on ideas that require linking a sentence or a part of it, not jus
  • in other words... (Score:1, Interesting)

    by Anonymous Coward
    ...handle more data, resolve contradictions and draw inferences from users' queries. The new improved Web will also combine pieces of information from multiple sites in order to find answers to questions.

    It will essentially be a librarian?

    The problem with this is that users first need to know what the heck they're actually looking for. You can draw as many inferences as you like, but so long as people search for "art" when they're interested in "tattoos" you aren't going to get much that's relevant. An
  • Welcome to 2001 (Score:2, Insightful)

    by the_demiurge ( 26115 )
    Hasn't everyone heard of this already?
    W3C semantic web activity from 2001 [w3.org].
    Heflin's Thesis [lehigh.edu] from 2001.

    I'm rather skeptical of the whole thing; it seems to me to be like "Wouldn't it be nice if people documented their web page content better? Then we could do all these neat things." The second statement is right, but I fear the first statement is intractable.
  • by ngunton ( 460215 ) on Saturday October 09, 2004 @11:20AM (#10479575) Homepage
    It seems to be a common mistake for computer scientists to think that it's possible to make systems that "understand" the world (both real and abstract knowledge), with all its complexity and ambiguity, in the same way that humans do. I feel that there is a fundamental difference between using computers to enable humans to organize stuff, and having computers automatically do it. Every single attempt at getting computers to be "smart" about inferring human intentions has ended up as an irritating impediment to using the system - look at Clippy, Bob, "intelligent" voice systems that try to "help" you by stopping you from talking to a real person... What computers are very, very good at is amplifying and enabling human intelligence. Computers are not themselves intelligent, and (my personal opinion) I don't think they ever will be - unless we manage to "grow" them using processes that we probably won't fully understand. You can't construct something that is as complex as the human mind through deterministic (i.e. consciously designed architectural) means - all you'll end up with, at best, is a very complex rule inference engine that is limited by the rules you gave it. Every "holy grail" of intelligent programming that has come along - neural nets, genetic programming etc - has turned out to be very limited (though very useful in special situations).

    I also feel that talking about automatically organizing the world's knowledge in a semantic web is just more of the same hot air that we've been hearing from AI departments for the last few decades. You can't automatically allocate meaning to something unless you have the capability for "common sense" reasoning, and the world knowledge at your fingertips to be able to interpret the data intelligently, like a human would. And even then, different humans would interpret it differently... so there are multiple meanings, and anyway, how to allocate "meaning" to something abstract such as a poem or piece of art?

    And if we require real people to add metadata to everything... well, it just ain't going to happen, in my humble opinion. Adding meta data is a pain in the ass, since you have to define the categories of object, agree on meanings for all the different taxonomies that will have to be used to describe the world... then there's the potential for abuse, as spammers will inevitably seed their documents with inappropriate metadata. So, the "honest" people can't be bothered, and the dishonest people will wreck anything that does get built. So, it ain't gonna happen.

    The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.

    Just my opinion... obviously there will be others out there who will vehemently disagree, and that's fine! Go ahead and try, you'll learn a lot in the process and you will probably come out with some tangential technology that you never thought of initially but is useful nonetheless.
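
    For readers curious what a link-based ranking of the kind praised above can look like, here is a simplified PageRank-style power iteration. It is a textbook-style sketch and not a claim about Google's actual production algorithm; the four-page link graph and the parameter values are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over {page: [outgoing links]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            # Pages with no outgoing links spread their rank evenly.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: three pages link to "hub", which links back to "a".
links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
```

    Nothing here inspects what the pages say; the ranking comes entirely from who links to whom, which is both why it needs no "understanding" and why it can be gamed with link farms.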
    • by DrEasy ( 559739 )

      The beauty of google (not that I love google, but they did hit a nail on the head) is that it requires no effort or "machine intelligence", beyond a very simple algorithm that depends not on AI but rather real, tangible relationships between words and documents (proximity and links). This is something that computers can be really good at.

      And that's the curse of AI right there. Because you happen to know the algorithm underneath Google, you don't think of it as "intelligent". But to the average Joe it can c

      • I guess it depends on how you define "intelligence". Chess is a very closed system that can be defined very precisely by rules - a great application for a powerful computer that can simply go down all the game paths (possibly using some predefined heuristics) and find the best solutions. Also, remember that the latest chess supercomputers have been "trained" with the best games from the past (human) grandmasters. So I don't really see a computer playing chess as being intelligent, unless you define differen
        • So I guess our difference in opinion comes from the fact that to me, a human being can also be thought of as a complicated machine. Intelligence is always an "illusion", as long as the machinery behind it is undiscovered.

          As computers solve more and more difficult problems (beating humans at chess, learning to filter spam, semi-autonomously exploring space...) we become blasé about our achievements and our ambitions rise. But let's not forget that when those problems were defined, we said that if a computer
    • Some time ago, on TV, I was watching a show on intelligent computers. They highlighted one project in which a program designed to learn word associations was left alone for a long time with huge quantities of texts. When the researchers came back, the computer had made associations like "father" and "president". They looked into its associations to see how it came to this conclusion, and it made sense. I don't remember the exact connections it used, but in many ways, the president is like a transient fa
      • Let me summarize:

        the computer had made associations like "father" and "president" [...] So, I do not doubt computers will begin to understand meaning

        A few questions about this inference: How is what the computer did anything more than a simple correlation? Could the computer tell you what the scope and limits of the association are (i.e., in what ways a president is like and unlike a father)? Could it parse/create a sentence in which it had to determine whether to use either of these words in a metap

    • ... unless we manage to "grow" them using processes that we probably won't fully understand ...

      My hypothesis is that the time it takes for data/information/knowledge to settle into a state of integration with retrieval/inference engines/mechanisms is a crucial factor that mostly gets too little attention (IMHO). This is supported by the fact that the socialization period of humans is so much longer than for all the rest of the mammals.

      I also feel that talking about automatically organizing the world's knowledge i
  • That's a language that can be parsed by computer! Rippin'. But I figure we'll just wait for the singularity before anything really changes, after which we'll use binary code or something. In the meantime it's English, which will be looked at as civilization's greatest joke at some point in the future. It will make it quite hard to make this semantic web.
  • by Doc Ruby ( 173196 ) on Saturday October 09, 2004 @11:23AM (#10479590) Homepage Journal
    Meaning is always "in context". Human communication always requires a "transmitter -> medium -> receiver" structure. Some say the universe is fundamentally structured on that model. When these semantic systems are overlaid on content, there are always these slippery, unresolvable mismatches of "intent" and "understanding", those "semantic arguments" that drive likeminded people crazy. Content searching is extremely powerful, without creating the "cracks" into which meanings can irretrievably fall. As long as there are alternative semantic indices to content still available "raw", semantics will just help. When we move to wrap all content entirely in semantics, we'll live in the "map is not the territory" problem forever. Ask CORBA programmers and EU language translators about the death of meaning by means of the dictionary. If we need to add semantics as a tool, we still get under the hood at the actual content.
  • I thought it was a web of money :-)

    Crispin

  • I can't wait for the spam people and porn sites to get a hold of semantic web technology.

    The meaning of is V1.agra and C011.3G3 GIRLZ!
  • Surely this just piles on more work for us poor poor developers....
  • by kubalaa ( 47998 ) on Saturday October 09, 2004 @12:05PM (#10479805) Homepage
    Semantic Web is the most ridiculous idea I've ever heard. The problem with meaning isn't representation -- English represents meaning just fine. The problem is meaning itself -- it doesn't matter if you figure out a way to encode it in some XML language, for every bit that it's easier for computers to use, it will carry that much less meaning.

    Another way of putting it is, any program capable of extracting the same meaning from XML that humans can, should be able to understand English without much trouble. It's the whole "Intelligence-complete" thing. Like NP-complete, there seems to be a class of problems which can only be solved by real intelligence, and they're all pretty much equivalent in that with real intelligence, you can solve them all.
    • That's the whole point though, English is extremely poor at representing meaning, and semantic annotation is intended to give keywords for more sensible reasoning.

      English is very poor. It's somewhat possible to get effective searching from something like Google using the structure of the document and its content, but better annotation will permit more accurate and more complete retrieval, as well as retrieval based on non-obvious features.
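
A minimal sketch of how annotation could narrow retrieval beyond keyword matching, assuming annotations are stored as simple (subject, predicate, object) triples. The document names, predicates, and the helper `subjects` are hypothetical and do not follow any particular Semantic Web standard.

```python
# Hypothetical annotations as (subject, predicate, object) triples.
annotations = [
    ("doc1", "mentions", "jaguar"),
    ("doc1", "topic", "wildlife"),
    ("doc2", "mentions", "jaguar"),
    ("doc2", "topic", "cars"),
    ("doc3", "mentions", "leopard"),
    ("doc3", "topic", "wildlife"),
]


def subjects(store, predicate, obj):
    """Return the set of subjects that have the given predicate and object."""
    return {s for s, p, o in store if p == predicate and o == obj}


# A keyword-style lookup finds every document that mentions the word.
keyword_hits = subjects(annotations, "mentions", "jaguar")

# Adding the topic annotation keeps only the documents about wildlife.
wildlife_hits = keyword_hits & subjects(annotations, "topic", "wildlife")
```

The second query is only as good as the annotations themselves, which is the objection raised elsewhere in this thread: someone has to write them, and write them honestly.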
    • This got insightful?!

      Let's take a look at English, shall we?

      "Milk costs five dollars."

      "Milk always costs five dollars."

      "Milk's price is five dollars."

      "Isn't it cool that milk costs that low, low price of five dollars?"

      "I am so gosh-darn happy that I can obtain the glorious bounty of milk for a mere five (count 'em, one-two-three-four-five) bills featuring our esteemed former president, George Washington."

      Now, let's take a look at some possible semantic web statements.

      Milk hasPrice $5

      anonymousItem has
    • You're right, English doesn't work, but that doesn't mean meaning cannot be encoded. The English language lacks context. Provide a language based on context and you can encode meaning. Context is nonlinear, and so should the language be.
  • Great. An Expert System to do your google searches based on what it thinks you meant. The giant Semantic 'Clippy' knows what's best when it pops up to say:

    ''Here are the results to the question you should have asked.''

    Maybe next they'll have the Semantic Web manage the way electronic voting is counted. Semantic Clippy will count your 'intent' instead of your actual vote.
  • The meaning of the Internet, eh?

    That should yield some interesting answers.

    "42. The Answer that you are looking for is 42."
    "You searched for "space ship one", but what you really want to search for is "natalie portman hot grits"."

    Isn't the whole point of the Internet a database of information which we can access using tools - not to create a "web of knowledge"?
  • why this will fail (Score:3, Informative)

    by ndunn ( 171784 ) on Saturday October 09, 2004 @12:26PM (#10479892)

    Google works because it is largely a statistical tool that uses some meta-information.

    I could see frameworks being used for very specific purposes, like searching a homogeneous web-site (e.g., slashdot, pubmed, nytimes) where all content is controlled. But extending these ideas to a heterogeneous web, whose users would no doubt take advantage of such a volunteer system, is ludicrous.

    I also take issue with the top-down mind-state that they will be able to predict what is useful to the user. This is why statistical importance and quantity is the only realistic method for such a massive undertaking (which google is still actively researching).

    I think that the only useful research to come out of such an endeavor would be to have news-sites, as mentioned above, implement and be scanned using an ontological browser. Of course, I am not sure how this would be different from LexisNexis.

  • To sports car enthusiasts, football fans and wildlife specialists, the word jaguar connotes highly discrete entities.

    Apple OSX aficionados.

  • This strikes me as eerily similar to Daniel Waterhouse trying to write down lists of everything for the Royal Society in Stephenson's Quicksilver.

    The whole reason the web is popular is because it's trivially simple to create content for it. Maybe the web would be more useful if it was like a giant encyclopedia but it's just an exercise in futility unless everyone gets on board.
  • One should always be suspicious of a technology that is promoted more on the basis of the desirability of its imagined future consequences than in light of practical, present-day successes. An unanswered question for me in all this is how many man-hours it would take even to retroactively tag all the data currently available on the web with semantically rich metadata.

    Here's an analogy that doesn't prove anything but reframes the problem. As far as I understand it, the Pentagon cannot be audited, because

  • "No formal theory," Heflin wrote in his proposal to NSF, "has considered how ontologies can be integrated and how they may change, or the role of trust in integration."

    Yes it has.

    See Relation Arithmetic Revived [boundaryinstitute.org] and Structure Theory [boundaryinstitute.org]. These two papers were written as a result of Hewlett-Packard's E-Speak project's support of a continuation of work begun at Paul Allen's thinktank, Interval Research. These then led to an understanding of the importance of identity theory in performing logic with what we

  • Making the web a 'web of meaning' looks very far off, if not impossible to realize. But some of the work will probably be used in the meantime to enhance the current state of the web, adding some meaning to it (but not enough to call it a 'web of meaning').

    It's analogous to C and Smalltalk. C++ and Java evolved, but are not as purely object-oriented as Smalltalk is.

    Either it is not a good model in its entirety or the time is not right for it (though I believe it's the former).

  • Whenever I see something about the semantic web, I go back to Clay Shirky's critique of it [shirky.com].

    A useful antidote to the hype.

  • Clay Shirky has written an excellent article about why the Semantic Web ain't gonna work [shirky.com]. I don't agree with everything he says, but it's a thought-provoking read nevertheless.


  • Given that the semantic web has been in development for years, and that the opinions(*) have long ago finished forming, I'm a little confused as to what this is doing on a news site.

    (*) Said opinions break down roughly thus:

    1% -- This is an amazing new way of perceiving and connecting data that will revolutionize computing in the future.
    9% -- This is a waste of time, a clearly impossible task that would seem of interest only to a certain breed of dysfunctional academics.
    90% -- Huh?
  • Making a semantic web that is based on English or even Latin languages would be a useless addition to the chaotic structure of the current web. The problem is not the way computers use our language, rather it is the language that we are trying to use.
    The assumptions that are beneath English are difficult to work with, and in reality wrong. When I say "I am a baseball player" the meaning is quite different from what that sentence portrays.
    As mentioned by another commenter, context is the most important element of
  • Semantic web...makes me think of the scene in WoO when the professor is "uncurtained" and the Great and Powerful Oz shouts something about "pay no attention to the PERSON behind the curtain" (don't remember exact wording). I mean, whose "meaning" is going to drive it? Could be cool tho'..."computer, search for meaning of prOn." WoooHooo!
