
News for nerds, stuff that matters

Challenging the Ideas Behind the Semantic Web

Posted by ScuttleMonkey on Wed Jul 19, 2006 12:39 AM
from the there-isn't-any-deception-on-the-internet dept.
mytrip writes to tell us that after a recent presentation to the American Association for Artificial Intelligence (AAAI) Tim Berners-Lee was challenged by fellow Google exec Peter Norvig citing some of the many problems behind the Semantic Web. From the article: "'What I get a lot is: "Why are you against the Semantic Web?" I am not against the Semantic Web. But from Google's point of view, there are a few things you need to overcome, incompetence being the first,' Norvig said. Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Problems w/ the Semantic Web (Score:5, Insightful)

    by CTalkobt (81900) on Wednesday July 19 2006, @12:49AM (#15741458)
    (http://www.ctalkobt.net/)
    is the users.

    Not the ones searching but the ones creating the content.

    There'll be some idiot out there (like there is now) who will code his data in a way that guarantees he gets the most page views, etc. So often-searched terms will turn up in search indexes and their ilk.

    It's a losing proposition unless you come up with filters, but then they have their own set of problems.

    • Are you just another Anti-Semanticist? by dolo724 (Score:2) Wednesday July 19 2006, @12:55AM
    • Re:Problems w/ the Semantic Web by ErikTheRed (Score:2) Wednesday July 19 2006, @12:56AM
    • Re:Problems w/ the Semantic Web by klenwell (Score:1) Wednesday July 19 2006, @01:01AM
    • ...is the users. Not the ones searching but the ones creating the content.

      Sure, the technical limitations of Joe Public might slow the growth of the Semantic Web on the whole, but what few people realize is that the Semantic Web has already existed for years in in-house or limited-audience networks. Just look at FOAFnaut [foafnaut.org] (an update in a few weeks will return it to full usability) or the very much real-world examples in Geroimenko & Chen's Visualizing the Semantic Web [amazon.com] (Springer, 2005).

    • A bad example: FreeDB by h_benderson (Score:1) Wednesday July 19 2006, @02:30AM
      • Re:A bad example: FreeDB (Score:5, Insightful)

        by kthejoker (931838) on Wednesday July 19 2006, @07:22AM (#15742422)
        Ugh, this is the major misconception of proper Semantic Web implementation.

        There are two types of user of Semantic Web material: the individual user and the group.

        The individual user only cares about context. It's like a Proustian adventure for him. If he tags Slashdot as "blatherscyte" because that's how he views it, then that's valid. If he tags it as "cmdrTaco" because he is stalking Rob, then that's valid, too. And if he tags it as "monkey" because one time he was petting a monkey while he viewed the site, then that's valid, too. It's like the old saying, "Whether you think you can or think you can't, you're right." There are no wrong semantics for the individual user, because it is his context alone which defines the usefulness of a tag.

        For this reason, the individual user should be allowed to tag freely and without limits, and also be able to edit or remove tags later.

        ----

        Now for the group, they have a different goal. Context does them no good, because they don't have the same context. Their goal then is consensus. Take your problem at FreeDB. The simple solution is to let people vote on the accuracy of disputed tags. Or flag ones they view as incorrect, and then review those that meet a certain threshold for flagging. Basically, you want the group to filter out things that don't apply to the group, WHILE maintaining individual context. You don't delete the tags that the group has rejected - you just hide them from the person who has come to view the group tags.
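The flag-and-threshold idea in the paragraph above can be sketched in a few lines. This is only an illustration under stated assumptions: the `TagStore` class, its method names, and the threshold value are all hypothetical, not anything FreeDB or delicious actually implements. The point it shows is that a rejected tag is hidden from the group view but never deleted from the tagger's own view.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical tag store: group consensus hides tags, it never deletes them."""

    def __init__(self, flag_threshold=3):
        # Number of distinct flaggers needed to hide a tag from the group view
        # (an assumed value, chosen only for illustration).
        self.flag_threshold = flag_threshold
        self.tags = defaultdict(set)   # (item, tag) -> users who applied the tag
        self.flags = defaultdict(set)  # (item, tag) -> users who flagged the tag

    def tag(self, user, item, tag):
        self.tags[(item, tag)].add(user)

    def flag(self, user, item, tag):
        self.flags[(item, tag)].add(user)

    def personal_view(self, user, item):
        """Everything this user tagged, regardless of what the group thinks."""
        return sorted(t for (i, t), users in self.tags.items()
                      if i == item and user in users)

    def group_view(self, item):
        """Tags on the item that have not reached the flag threshold."""
        return sorted(t for (i, t) in self.tags
                      if i == item
                      and len(self.flags.get((i, t), ())) < self.flag_threshold)


store = TagStore(flag_threshold=2)
store.tag("alice", "cd1", "jazz")
store.tag("bob", "cd1", "monkey")
store.flag("alice", "cd1", "monkey")
store.flag("carol", "cd1", "monkey")
print(store.group_view("cd1"))            # bob's tag is hidden from the group
print(store.personal_view("bob", "cd1"))  # but bob still sees it
```

Keeping flags separate from tags is what lets the individual context survive the group's filtering.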

        I think this dichotomy of group vs. individual is what has gotten us into trouble with the Semantic Web. To use one example, I think delicious' big mistake was to show you "popular" tags for a given link. What that does is encourage you not to create your own tags, but instead just piggyback on popularity. Over time, this creates homogeneity, which is great for the group, but not for the individual user. Sure, they can probably find that link again in a minimal amount of time, but if an individual tag would have helped them find it faster and they shunned individual tags for groupthink, so much the worse for them.

        And on the flipside if you don't provide proper weighting and trust metrics into your tagging system, you are opening yourself up to not only abuse and inappropriate behavior, but also to the "incompetence" mentioned in the article, which is not so much incompetence as a zero-filter. It's like reading Slashdot at -1. It's kind of a touchy-feely way to look at it, but in Web 2.0 thinking, it's bad to delete content; just filter it out instead. It's bad to censor opinions from the software side; let each user do their own stifling. Give the users complete control over the content, and they will find models that work. It's that simple.

        The main problem with the Google guy's point is that philosophically, Google is more groupthink than individual user, because they're a search engine. They value consensus over context. In the future, perhaps they will value context a little bit more than they do. Until then, they have to stand where they stand, because they can't let context into their system. They've tried some clunky mechanisms to do so (Personal Search, anyone?) but until they get it right, the Semantic Web won't have any value to them.
        • Re:A bad example: FreeDB by berbo (Score:1) Wednesday July 19 2006, @09:16AM
          • Re:A bad example: FreeDB (Score:4, Insightful)

            by kthejoker (931838) on Wednesday July 19 2006, @12:05PM (#15744411)
            That was the entire point of my post! The group benefits from standardization, but the individual suffers. The Semantic Web is an attempt to give power back to the individual user. Subjectivity is a crucial element of the system, and sanitized, standardized, NPOV systems deny the individual subjectivity.

            Delicious is very smart in that it left the *option* for customised tags, but they are clearly saying by implication that the best tags are the ones everyone else is using. My point being that the idea of a "standardized vocabulary" is antithetical to the ideals of the Semantic Web. We don't want a democracy of ideas; we want a free market of ideas!

            Think of the concept "funny." Let's say I asked you to go to 100 different random sites and tag them as funny or not funny. Let's say that of the sites you listed as funny, it was clear you enjoyed witty, New Yorker-style humor, and not fart jokes. But let's say 99 other people did the same thing, and they did the opposite: they clearly enjoyed the fart jokes, and hated the New Yorker wit.

            Now if you asked this seeded engine for a recommendation of a new, 101st site that was funny, should it give you fart jokes, or New Yorker style? This is the power of the Semantic Web. What's funny to you, isn't funny to everyone else. Why should you be punished for that? And if a total n00b comes to our engine for a recommendation, they get the fart jokes page, because it assumes they're like everyone else. But if they start marking those sites as not funny, eventually it'll figure out they're more like you, and start giving them sites that you like.
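The behaviour described above (a newcomer gets the crowd's taste, then drifts toward like-minded raters) is roughly what a similarity-weighted vote does. Here is a minimal sketch, assuming each user's ratings are a dict of site to True/False for "funny"; the function names, the rating data, and the weighting scheme are illustrative assumptions, not how any real engine works.

```python
def similarity(a, b):
    """Map agreement on co-rated sites to [-1, 1]; 0.0 when nothing is shared."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    agree = sum(a[s] == b[s] for s in shared)
    return 2 * agree / len(shared) - 1


def predict_funny(user, site, ratings):
    """Similarity-weighted vote of the other users who rated `site`."""
    mine = ratings.get(user, {})
    score = 0.0
    for other, theirs in ratings.items():
        if other == user or site not in theirs:
            continue
        # A user with no history is assumed to be like everyone else,
        # so every rater counts equally for them.
        weight = similarity(mine, theirs) if mine else 1.0
        score += weight if theirs[site] else -weight
    return score > 0


crowd = {"wit1": False, "fart1": True, "fart2": True, "wit2": False}
ratings = {
    "you": {"wit1": True, "fart1": False},
    "crowd1": dict(crowd),
    "crowd2": dict(crowd),
    "newbie": {},
}
print(predict_funny("you", "fart2", ratings))     # crowd taste counts against it
print(predict_funny("newbie", "fart2", ratings))  # newcomer gets the crowd's pick
```

Raters who consistently disagree with you get a negative weight, so the crowd's "not funny" on a New Yorker-style site turns into a recommendation for you.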

            Now, will delicious ever do that? Of course not, because it doesn't offer any discrimination to you on the word funny. You get the democratic version of funny. Fart Jokes for all. And that's what "standardization" has to offer. So, no, you can keep that; I want the Internet to understand who I am, and what I like, not what everyone else likes. And if they HAPPEN to coincide, that's fine, so much the better - things are popular because of the people, after all - but they shouldn't have to.
        • Re:A bad example: FreeDB by coolGuyZak (Score:2) Wednesday July 19 2006, @10:47AM
    • Re:Problems w/ the Semantic Web by JiveDog (Score:1) Wednesday July 19 2006, @05:22AM
    • Re:Problems w/ the Semantic Web by dbc001 (Score:3) Wednesday July 19 2006, @09:01AM
  • Semantics... (Score:5, Funny)

    by Thakandar2 (260848) on Wednesday July 19 2006, @12:52AM (#15741464)
    "Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user."

    Here I was, thinking we were arguing over Semantics...
  • Damn (Score:5, Funny)

    by ErikTheRed (162431) on Wednesday July 19 2006, @12:53AM (#15741465)
    (http://www.renaughty.com/)
    "...Norvig clarified that it was not Berners-Lee or his group that he was referring to as incompetent, but the general user."
    Because Norvig vs. Berners-Lee going 10 rounds in a cage is something I'd pay to see.
  • by UR30 (603039) on Wednesday July 19 2006, @12:56AM (#15741473)
    (http://radio.weblogs.com/0112083/)
    The current semantic web seems to offer a technology too fragile to use on the global scale: consider the complexity of the various classification and ontological schemes, the work needed to provide the metadata, etc. Also, the semantic web seems to offer great opportunities for spammers and other mischief makers. We already have comment and reference spamming now, but the semantic web (on the global scale) raises the possibilities enormously.
    • by znu (31198) <znu@acedsl.com> on Wednesday July 19 2006, @01:36AM (#15741550)
      The full semantic web scheme really ignores a lot of what the Internet has taught us about what technologies succeed. It's not about grand visions and long specifications, it's about simple stuff that solves real problems of limited scope. Look at RSS, for instance; it's about the simplest thing which could do the job it does.

      I think we'll eventually realize most of the benefits of the semantic web, but it won't be a result of a grand vision imposed from the top down and implemented all at once. It'll probably be through increasing adoption of microformats [microformats.org], which don't try to classify and specify everything, and are implemented entirely using existing web standards.
      • Re:Semantic web is currently fragile technology by mgblst (Score:2) Wednesday July 19 2006, @02:40AM
      • Re:Semantic web is currently fragile technology by Crayon Kid (Score:3) Wednesday July 19 2006, @05:12AM
        • by Bogtha (906264) on Wednesday July 19 2006, @05:52AM (#15742161)

          Tim Berners-Lee seems to stress the fact that the semantic Web is all about AI doing content classification for us.

          I don't think I've seen him stress that in the sense that the users are disassociated from the process. The Semantic Web is all about representing things like tags, microformats, etc., in a generic way.

          For example, if comment moderation was defined in terms of a relationship between a person, a comment, and an opinion, that doesn't mean a computer would be moderating comments, it just means that the same mechanism could be applied across multiple websites, without having to build moderation into the websites themselves. You could mod Dvorak -1, Troll, and everybody who lists you in their FOAF file using a browser that supports it, would see that moderation.
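To make the moderation example concrete, here is a rough sketch that stores the person/comment/opinion relationship as plain (subject, predicate, object) statements. It deliberately avoids any real RDF library; the predicate names ("knows", "rated", "about", "score", "label") and the example URL are made-up placeholders, not actual FOAF or W3C vocabulary terms.

```python
# Hypothetical statements, in the spirit of RDF triples.
COMMENT = "http://example.org/comments/42"
triples = {
    ("alice", "knows", "bob"),          # alice lists bob in her FOAF-style file
    ("bob", "rated", "rating1"),
    ("rating1", "about", COMMENT),
    ("rating1", "score", -1),
    ("rating1", "label", "Troll"),
    ("mallory", "rated", "rating2"),    # nobody lists mallory
    ("rating2", "about", COMMENT),
    ("rating2", "score", 1),
    ("rating2", "label", "Insightful"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known statement."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def moderations_visible_to(reader, comment):
    """Moderations of `comment` made by people the reader lists as known."""
    visible = []
    for friend in objects(reader, "knows"):
        for rating in objects(friend, "rated"):
            if comment in objects(rating, "about"):
                for score in objects(rating, "score"):
                    for label in objects(rating, "label"):
                        visible.append((friend, score, label))
    return sorted(visible)


print(moderations_visible_to("alice", COMMENT))
```

Nothing here is specific to one website: any site or browser that can read the same statements could apply the same filter, which is the generic mechanism the comment describes.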

          Just because the focus is on making the software smarter, it doesn't mean that it's about replacing user opinions with computer opinions. In fact, the majority of the Semantic Web stuff I've seen has been all about codifying user opinions to make them more accessible to computers, and thus more easily exposable to the end-user in a useful way.

          • Hear, hear! by TuringTest (Score:2) Wednesday July 19 2006, @10:59AM
    • Complex? Opportunities for spammer? Don't think so by CaptSolo (Score:3) Wednesday July 19 2006, @06:38AM
  • Googlebombing (Score:5, Insightful)

    The biggest problem with the semantic web is spam. If you can trust the tags, it's a beautiful idea. If you can't, it's worse than useless - it's a waste of time. Google has the right idea, automatic extraction of semantics from content. If there's no real content, then (hopefully) that will be reflected in the semantic analysis.

    Me, I estimate we're 5-10 years away from doing anything terribly useful with all of this stuff, but I can definitely envision the day when an internet without semantics seems as distant as an internet without Google.
    • Re:Googlebombing (Score:5, Insightful)

      by Wastl (809) on Wednesday July 19 2006, @01:28AM (#15741537)
      (http://www.wastl.net/)

      The "Semantic Web" is not about search engines, as you and many other posters seem to believe. It is about representing Web content in a structured, formal way that is more easily accessed by machines, going beyond simple presentation. This can be used for searching, but also for many other applications, e.g. integration, exchange, personalisation, ... .

      Spam content on the Semantic Web is in no way different to spam content on the normal Web (well, except that it is formal). This also means that a search engine that is capable of working with Semantic Web data has exactly the same issues with trust as traditional search engines. Except that on the Semantic Web, trust can be expressed formally as well. Similar to the authorities in Google, whose outgoing links make a statement about the trustworthiness of other sites, an "authority" on the Semantic Web can make statements about the trustworthiness of other sites. However, these statements are explicit, and they could also be used to state that another site is *not* trustworthy.
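As a small illustration of explicit trust statements, including negative ones, consider the sketch below. The site names, the "trusts"/"distrusts" verbs, and the scoring rule are all assumptions made up for this example; no actual Semantic Web trust vocabulary is implied.

```python
# Hypothetical explicit trust statements: (source, verdict, target).
statements = [
    ("authority.example", "trusts", "news.example"),
    ("authority.example", "distrusts", "spam.example"),
    ("spam.example", "trusts", "spam.example"),  # self-promotion
]


def trust_score(site, my_authorities, statements):
    """Sum the verdicts about `site`, counting only sources the reader accepts."""
    score = 0
    for source, verdict, target in statements:
        if target != site or source not in my_authorities:
            continue
        score += 1 if verdict == "trusts" else -1
    return score


authorities = {"authority.example"}
print(trust_score("news.example", authorities, statements))
print(trust_score("spam.example", authorities, statements))
```

Unlike an outgoing link, a statement here can say outright that a site is not trustworthy, and a site vouching for itself counts for nothing unless the reader already accepts it as an authority.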

      Google has the right idea, automatic extraction of semantics from content.

      Google does not extract any semantics from content. It merely analyses the linking between websites and connects that with keywords. No semantics here.

      Sebastian

    • Re:Googlebombing by russellh (Score:2) Wednesday July 19 2006, @02:04AM
    • Re:Googlebombing by radtea (Score:3) Wednesday July 19 2006, @09:44AM
  • by rsidd (6328) on Wednesday July 19 2006, @01:14AM (#15741502)
    Thanks for the illustration of what Norvig meant. How is "Google Director of Search and AAAI Fellow Peter Norvig" (original article) semantically equivalent to "fellow Google exec" (Slashdot summary)? The latter suggests that Tim Berners-Lee too is a Google exec, which would be news to him.
  • Filtered semantic webs might work (Score:2, Insightful)

    by dr_pump95 (869367) on Wednesday July 19 2006, @01:18AM (#15741512)
    Semantic webs (emphasis on plural) produced by editors such as those at /. or in the consumer-rated style of Digg, Del.icio.us etc might actually work. Trusting authors to do it right is a disaster, as Norvig suggests.
  • Always bet on the million monkeys (Score:5, Insightful)

    by IvyMike (178408) on Wednesday July 19 2006, @01:26AM (#15741528)
    It's really, really difficult to get people to follow rules. We're lazy, we're incompetent (yes), and some of us are evil. I still don't think I truly understand how RDF is supposed to work exactly, and it doesn't even seem like it will be fun to try.

    On the other hand, it's really easy to release a million monkeys and let them create what they will. It's not so easy to sort through what they end up producing, but Google does a surprisingly good job of this.

    It reminds me of the early days of the Web, when companies like CompuServe and AOL wanted to design and own all content. On the other hand, an internet server with httpd let anybody make a ~/public_html directory and put up whatever they wanted to. The million monkeys won that battle. I think they'll win this one, too.
  • Blaming the user is never right (Score:4, Insightful)

    by robolemon (575275) <nertzy@NOsPam.gmail.com> on Wednesday July 19 2006, @01:30AM (#15741540)
    (http://blog.nertzy.com/)

    From http://www.7nights.com/asterisk/archive/2004/03/dont-blame-the-users [7nights.com]

    Blaming the users for anything should raise a huge red flag that you've got some usability problems.

    Maybe the Semantic Web should aim to be useful to people rather than require people to be useful to it. There has to be a better way than trying to educate droves of people to a problematic and vulnerable design.

  • Web of Trust (Score:5, Interesting)

    by VDM (231643) on Wednesday July 19 2006, @01:40AM (#15741558)
    (http://www.dellamea.it/)
    In one of the very first papers [w3.org] mentioning the Semantic Web, a paragraph was devoted to something since lost in the hype around the semantic web: the Web of trust, which was meant to be something like a certification of metadata. This should perhaps be regarded as important again for the semantic web and the web in general (although it is not easy to manage).
    By the way, Norvig is not only a Google exec, but also a well-known AI researcher, author of one of the most important books [berkeley.edu] on that subject.
  • Norvig's personal project (Score:5, Interesting)

    by tfinniga (555989) on Wednesday July 19 2006, @01:43AM (#15741563)
    Slightly offtopic. Peter Norvig gave a talk at my university on similar topics, and there was a short Q&A afterwards.

    One of the students asked him what he did for his 20% project. He said that he was usually too busy keeping tabs on what the other employees were doing with their 20% time, so he didn't quite get around to working on his. He told us what he wanted to do, as motivation for himself.

    The basic idea is that when he used to work for NASA, it'd always make him upset when people saw faces in random spots on the moon's terrain, and claimed it was aliens that NASA was covering up, or similar. So, he was planning on taking facial recognition software and running it on all of google earth. I think it'd be pretty awesome..
    Any progress yet, Mr. Norvig? I'd love to see the results.. :)
  • That's it gentlemen (Score:2, Funny)

    by Anonymous Coward on Wednesday July 19 2006, @01:50AM (#15741575)
    The jig is up!
  • I've been complaining about this and related issues for a while now. my last journal [slashdot.org]
  • not just the general users (Score:3, Interesting)

    by Mofaluna (949237) on Wednesday July 19 2006, @01:59AM (#15741585)
    It's the business users too that are a problem. I'm currently trying to get a project on the rails based on semantic web technology, and I'm confronted with an IT department where some are even struggling with the difference between subtyping and instantiation, let alone more advanced modelling issues... It doesn't help, of course, that most people have never even heard of conceptual modelling languages such as ORM [orm.net] but instead were taught to use UML and ER, where it's the modeller's responsibility to make a distinction between what is conceptual, logical and physical, which of course most never did.

    In regards to the Google issue, I think the idea that you should crawl everything is faulty, because you need to be able to trust the source. Most ontologies will simply be restricted to a certain domain and corresponding user group, often in a B2B context. Integrating every man and his dog, the lawnmower and the kitchen sink with some kind of top-level ontology is merely a nice-to-have philosophical issue that I don't expect to be solved in the near future, if only because we haven't seen many advances since Aristotle started toying around with the idea. In other words, at Google they are worried about an issue that's at least a decade away from now, probably even more.
  • Hmph... (Score:5, Funny)

    by Jello B. (950817) <jellobmello.gmail@com> on Wednesday July 19 2006, @02:06AM (#15741596)
    That anti-semantic bastard...
    • Re:Hmph... by BRSQUIRRL (Score:2) Wednesday July 19 2006, @09:37AM
  • Sem Web, meet Chicken & Egg (Score:4, Informative)

    by AlXtreme (223728) on Wednesday July 19 2006, @02:07AM (#15741598)
    (http://www.aperte.nl/ | Last Journal: Monday July 07 2003, @05:11AM)
    The semantic web is, in my eyes, a typical chicken & egg problem. You've got loads of content on one side, yet current search engines work well enough that nobody worries about representing that content in a structured way in a markup language like OWL. On the other side, you've got embarrassingly few semantic web applications that use structured content. How is a typical web developer going to justify structuring the content on his side if he can't point to an example of how it could improve shareholder value? What would exporting our databases in OWL currently solve?

    True, the web had a similar problem; however, creating a webpage is a lot more interesting than structuring data (you see the results directly; however terrible they might be, you do see a result). The latter takes a lot more work, and the direct benefit just isn't there.

    Sem-Web-like standards like RSS, XML and SOAP have become mainstream, but primarily because they fill a gap. The adoption of RDF or OWL simply doesn't solve anything. Yet. It would be cool to let agents loose onto the semantic web and retrieve them together with a summary on a certain subject using a multitude of sources, but as long as it's easier to Google I don't think it would generate any interest outside academia.

    Feel free to prove me wrong though.

  • I See Value in the Semantic Web (Score:2, Insightful)

    by n is prime (989698) on Wednesday July 19 2006, @03:08AM (#15741728)
    Even if we are inherently lazy, and even though some people seem to be generally against the idea, it doesn't make any sense to me not to employ this and experiment with it. Norvig is an AI guru, and his ideas on the Semantic Web may be interesting, but Google is not against the idea. Google's GData looks to me like a primitive Semantic Web. Even if only 10% of web masters adopt the system, querying to find a set of results that have been tagged with certain meta-data can come up with some interesting results. If the results are interesting enough, more webpages will include meta-data tags. Also, being inherently lazy argues for not spending time writing tags all over your code, so why would anyone take the time to sabotage the system? While I understand the difficulties of the spamming problem, there are plenty of cookies on the internet anyway. I think the same inherent problem in the Semantic Web exists with PageRank. In PageRank what happens is a web page will say the same words over and over to achieve a higher ranking in the semantic analysis of the page, and thus the page will be a top result when entering a query with related words. But I think PageRank works pretty well overall. Google's next step with PageRank is to filter all the spam sites that just say the same words. Security in the Semantic Web would also be to filter those sites with obviously spammy RDF or OWL tagging. Overall the Semantic Web is a cool project that could lead to really smart searches, with axioms involving how different meta-tags are related to each other. I'm in favor of the new technology.
  • by Anonymous Coward on Wednesday July 19 2006, @03:09AM (#15741733)
    The idea of RDF is applicable to much more than public innerweb content. I've spent the last 7 months researching and developing an RDF-backed system for my company's core products. Everyone should think of the value of RDF beyond the scope of trust, and then it becomes easy to realise methods of simple non-web implementation. We can all spend the next 5 years pondering how we're going to figure out trusted content providers for RDF web data, or we can just start developing apps for sources which understand themselves as trusted (i.e. data input from an individual, employees of a company, and any group where the individual must be accountable for their actions). What's more important than the blind trust of sources is data verification. There are ways to run data input from one user by another user for validation, without doing it in an infringing, demanding way. I'd like to go into detail of exactly what I mean by all this, but I don't want to violate any portion of my NDA or tip off industry competition (I know that sounds retarded, sorry). If RDF does gain popularity, I can say it will be from within the private sector, not the public. Genius implementation may bring RDF to the public sector, but that's not something I would say is guaranteed to happen.

    Current technical obstacles to creating any RDF application: the complexity of its integration into DB-backed systems (popular methods), and instantiated class marshaling within not-so-object-oriented languages. The technical design and implementation of a standards-compliant RDF system has been extremely difficult for me. I don't think it would ever be possible to get RDF data represented nearly as minimally as you could with simple relational tables (although formally no more bloated than bloaty XML). RDF also creates many long linked relationships; this tends to create some serious performance issues in querying the data. Lastly, I hate XML, and you can't always export from RDF to XML (capable type to incapable type) in a correct manner.
  • Semantic knight (Score:4, Funny)

    by Anonymous Coward on Wednesday July 19 2006, @03:20AM (#15741769)
    This reminds me of the famous Semantic knight [xml.org] parody...
  • Threat to google's business model? (Score:2, Interesting)

    by Anonymous Coward on Wednesday July 19 2006, @03:28AM (#15741789)
    Do not forget that the Semantic Web is not a replacement for existing technologies: HTML content will always be there. But what if these little 'metadata' descriptions were added to ALL web pages? In that case, pages could be categorized, analysed and searched much more easily, and the algorithms for these operations would be better. In such a scenario, the choice of one Web search engine over another would be irrelevant, because all of them would have powerful and accurate algorithms. Maybe a threat to Google's business model? That would be the perfect world, but we have to assume that websites would certainly lie or make mistakes in their semantic descriptions. OK, but... would that produce a scenario worse than the current one? Today, fake websites are quite common; irrelevant sites advertise themselves by every available means to attract the most visitors, misleading them. The best Web search engine is the one that best filters these sites out of its results. In a semantically described Web the problem will be the same, but there would be another easy-to-use filtering criterion to enhance the results. The Web search engines' algorithms will be better for sure.
  • Too complicated (Score:1, Insightful)

    by Chris Graham (942108) on Wednesday July 19 2006, @04:51AM (#15742000)
    (http://ocportal.com/)
    There is no way that regular people, even the majority of intelligent educated people, are going to be able to use it. It's a ridiculous pipe-dream. Think how hard it is to get people to understand broken-down logical arguments where everything is already laid out for them, and now imagine trying to make them understand how to conceptualise their own data domains and define their own relationships. Maybe 2% of people could do it properly, and then 1% of those would end up in a profession that would use the skills.

    When programmers write software for general use we have to think how to make things easy multiple levels below the level we have to think at. The vast majority of people are not able to think technically, and do not have patience - and that's because most people in this world find it uncomfortable to do anything that isn't centred around a social or emotional act.

    Developers find users can't do programming, so the programming language becomes a graphical interface. The users can't navigate the graphical interface via a structure based on logic, so the screens get built into an icon-based organisation with a well-defined 'workflow'. The user can't think logically about how to use the graphical interface, so help is written to explain how it works and what it can do. The help is too general, so specific examples are given. There are too many examples and the user can't be bothered to read them, so a colleague stands next to them and they learn to mimic their colleague.
    This isn't an extreme situation - it is typical of the vast majority of users. Now think about the inherent technical complexities of OWL and RDF, and imagine people actually using it for real problems? There's no way to hide what is a purely logical and structural framework for organising extensive data, behind pretty pictures and simple examples.
  • Blame the user (Score:2)

    by sco08y (615665) on Wednesday July 19 2006, @05:16AM (#15742066)
    Brilliant! Blame the user. No, it's not that you don't have a rational data model (you know, so that those "semantic" tags actually *mean* something) or that you haven't done squat to even suggest a proper UI, it's the user's fault.

    And it *certainly* couldn't be that HTML is a piece of fucking garbage and that trying to kludge semantics into the spec is an effort doomed from the beginning.
  • Insult humanity (Score:1)

    by MADnificent (982991) on Wednesday July 19 2006, @05:52AM (#15742166)

    So instead of insulting a very small number of people, he insults everybody in the world...

    Way to go!

  • Some well-known researcher called the emperor naked. Maybe they believe him more than they did the practitioners who pointed out the Semantic Web's problems long before. Here we see that fairy tales are not true -- a small child is not sufficient, we need a bigshot to notice.

    News at 11...

  • by saddino (183491) on Wednesday July 19 2006, @07:40AM (#15742502)
    Although trust is certainly an issue when it comes to the Semantic Web, the real problem is that its design is not a true abstraction, but is nothing more than more metadata. And like the actual textual data in a typical web page, it suffers from all the same problems, save for one: being unstructured (and thus not truly parseable).

    IMHO, the Semantic Web is solving one problem (the lack of structure and descriptive context in textual HTML content) in a very hard way (asking the entire web to implement this new RDF).

    Many companies (disclaimer: like my own [q-phrase.com]) are approaching these problems from a different angle: working on statistical and semantic systems to extract structure from the textual content that is already there on the web page.

    Now some people will argue that trying to create a system that can understand language/content is insanely difficult.

    But what is a more realistic time frame? The one in which an intelligent parser can begin to understand the content that is already on the web, or the one which requires the entire world to implement a solution to a problem they don't even realize is a problem?

  • pardon my ignorance (Score:3, Interesting)

    by plopez (54068) on Wednesday July 19 2006, @08:24AM (#15742735)
    But what, exactly, is the definition of the 'Semantic Web'? How is it different from what has been done in the past? Is there any agreement of any sort as to what it means? If yes, please let me know. If not, then how can we achieve this goal if we do not know what it is?

    I am confused, I really do not see too many differences in the web in the last few years. Nothing 'Earth Shattering' anyway.
  • Old AI vs New AI (Score:2)

    by Alomex (148003) on Wednesday July 19 2006, @09:13AM (#15743036)
    (http://slashdot.org/)
    The Semantic Web is in the old AI tradition of grand overhyped promises with little results to show for them many years later. AI had managed to move away from this practice, which had led to the crisis in funding in the 80s, when people woke up to the fact that AI did not deliver as promised. Here at AAAI there is a sentiment that the semantic web is a step in the wrong direction, and Tim Berners-Lee's talk here was presented as such. Here's the abstract from the program:

    The relationship between AI and the semantic web has been something that has provoked a lot of heated corridor discussion over the years. This talk will try to outline what the semantic web is and is not, at a conference where there may be some anniversary reflection on what AI is and is not. It is not always obvious how to transfer existing AI techniques into a fractal weblike space, or what the effect will be. But it is certainly exciting.

    The same chasm appeared during the founders panel, where John McCarthy gave a more sober, cautious perspective of where AI is going, while Marvin Minsky issued a call for the old style of over-hyped research such as the "emotion machine", whatever that means.

    There was also a feeling that perhaps some of the grand challenges are too ambitious. "We can't make predictions since in some cases we don't even know what the problems are" a famous panelist noted. It is good to have long term goals, but they must be set within the realm of what is at least vaguely foreseeable. Challenges beyond that boundary are in the realm of science fiction not scientific AI.
  • ... are still unsolved. The problems of data inconsistency (from bogus or fraudulent data entry) are bad enough, but the semantic web idea has problems even if you assume all the data is valid. There are some theoretical results on inheritance networks (a classic AI predecessor to semantic web representation) from the 1980's and 90's that are rather depressing:
    • Touretzky's dissertation where he shows that if you allow exceptions, it's hard to keep inheritance networks globally consistent
    • Another result I can't locate right now which proved certain basic inferencing techniques in inheritance networks to be NP-complete

    That last one means that straightforward, guaranteed-to-reason-correctly searchers for semantic webs won't scale, which means their use on the global internet is problematic. Failure to scale was one of the major causes of AI winter, guys.
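To make the consistency problem concrete, here is a minimal sketch of a toy inheritance network with exceptions. The network contents and the `infer` helper are made up for illustration, and the "nearest ancestor wins" rule is a naive stand-in, not the ordering Touretzky actually analyzes. The penguin exception resolves cleanly, but the classic "Nixon diamond" (a Quaker and a Republican at the same distance) leaves the reasoner holding two contradictory defaults.

```python
# Hypothetical toy network: IS_A edges plus default properties (with exceptions).
IS_A = {
    "tweety": ["penguin"],
    "penguin": ["bird"],
    "nixon": ["quaker", "republican"],
}
DEFAULTS = {
    "bird": {"flies": True},
    "penguin": {"flies": False},       # exception to the bird default
    "quaker": {"pacifist": True},
    "republican": {"pacifist": False},
}

def infer(node, prop):
    """Return the set of values for prop found at the nearest ancestors.

    Walks the IS_A graph level by level and stops at the first level
    where any node states a default for prop. More than one value in
    the result means the network is ambiguous for this query.
    """
    frontier = [node]
    seen = {node}
    while frontier:
        values = {DEFAULTS[n][prop] for n in frontier if prop in DEFAULTS.get(n, {})}
        if values:
            return values
        next_level = []
        for n in frontier:
            for parent in IS_A.get(n, []):
                if parent not in seen:
                    seen.add(parent)
                    next_level.append(parent)
        frontier = next_level
    return set()

print(infer("tweety", "flies"))    # exception overrides the bird default
print(infer("nixon", "pacifist"))  # both True and False: no consistent answer
```

Even this tiny example needs an extra policy to break the tie, and every such policy has to hold up across the whole network, which is where the global-consistency trouble starts.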
  • by goat_roperdillo (984552) on Wednesday July 19 2006, @10:38AM (#15743667)
    Berners-Lee's insistence that his Semantic Web ideas can work has kept him in a backwater of the WWW. Berners-Lee developed the SW ideas early, without outside critical thought and the SW remains a pipe dream.

    The basic problem with the SW is that the use of separate ontologies defies any exchange process that does not include human intelligence. IOW to do it properly there must be human intervention. But Berners-Lee keeps thinking that there is a shortcut - there isn't. Better men have trod this path and know what's at the end of the road.

  • by alucinor (849600) on Wednesday July 19 2006, @01:28PM (#15745025)
    (Last Journal: Sunday February 05 2006, @06:11PM)
    If we want to get users to enter in metadata, we need to do three main things:

    - create editors that automate the syntactical complexities of RDF/OWL, like what blogs have done for HTML.
    - make entering metadata entertaining somehow.
    - make some killer apps that show to regular users the usefulness of the semantic web.

    Then we'll have a semantic web. Problems like spam can just be addressed as we come to them, but Web of Trust is probably a good start.
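As a rough illustration of the first point, this sketch shows the kind of thing such an editor might do behind the scenes: take a few plain form fields and emit N-Triples lines. The `to_ntriples` and `escape_literal` helpers, the example URL, and the field names are all hypothetical; the only real vocabulary used is the Dublin Core element namespace.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def to_ntriples(page_url, fields):
    """Turn simple form fields (hypothetical: title, creator) into N-Triples lines."""
    return [
        f'<{page_url}> <{DC}{name}> "{escape_literal(value)}" .'
        for name, value in fields.items()
    ]

# The user only ever sees a form with "title" and "creator" boxes.
lines = to_ntriples(
    "http://example.org/post/1",
    {"title": 'Why "tags" matter', "creator": "alucinor"},
)
print("\n".join(lines))
```

The user never types an angle bracket or an escape sequence, which is roughly what blog software did for HTML.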
  • All I would ask of the Semantic Web evangelists is that they go off together and build a network of Semantic Web systems that proves the following:

    1. It works
    2. It is more useful than currently existing practices
    3. It is more cost-effective than currently existing practices
    4. There is a killer application that is not possible using currently existing practices

    If they can do #1 and any one of the other 3, then maybe people will see the value and start adopting it in the real world. Until then, the Semantic Web sounds like the Dvorak keyboard to me: a "solution" that everybody thinks is worse than the problem it is trying to solve, because it requires millions of people to change the way they do things without much proven benefit. What the Semantic Web needs to do is prove that it is actually worth implementing by showing some honest results.
  • Google seemed to be looking to the web for meaning, but they should be building their own Ontology of Everything, based on what they find in the content. Letting Cyc loose on their caches would perhaps be a good start. Then integrate their Ontology of Everything with those Formal Ontologies that already exist. About Intelligent Searching: when a person asks me for advice, I tap into my Personal Ontology, which has overlap with other ontologies in a domain-specific way. I.e., I read information, much of it structured, then fit it into my Personal Ontology and, if required, expand my ontology to fit the new information. I may even face a paradigm shift that requires a major restructuring of my Ontology, i.e. I need a set of new transforms to link the old with the new in a way that lets me sanely access both. At this point I have acquired new Knowledge which I can now share with people that ask me questions. When I'm talking to a Knowledgeable Source I need to find the Transforms that allow me to incorporate "knowledge of shared knowledge" as well as knowledge of our unique knowledge. This is how we are able to communicate and learn from each other. If I am dealing with a Naive Searcher I need to Probe their Personal Ontology or World View until I am able to construct enough domain-specific transforms to allow me to know what they are trying to learn and how best to find it and Teach it to them.
  • Re:nifty! (Score:1)

    by The_Cheese_Stands_Al (940215) on Wednesday July 19 2006, @12:56AM (#15741472)
    (Last Journal: Wednesday June 07 2006, @08:08PM)
    Well, I would say the average /.er would classify as competent. Think of your average myspace denizen, or someone of that nature. On another note, Norvig seems to have put his foot in his mouth. Nice recovery, though.
    • Re:nifty! by nolsen (Score:2) Wednesday July 19 2006, @01:05AM
  • Re:Wow, that was confusing (Score:1, Offtopic)

    What's a semete?