OpenCyc 1.0 Stutters Out of the Gates

moterizer writes "After some 20 years of work and five years behind schedule, OpenCyc 1.0 was finally released last month. Once touted on these pages as "Prepared to take Over World", the upstart arrived without the fanfare that many watchers had anticipated — its release wasn't even heralded with so much as an announcement on the OpenCyc news page. For those who don't recall: "OpenCyc is the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine." The Cyc ontology "contains hundreds of thousands of terms, along with millions of assertions relating the terms to each other, forming an upper ontology whose domain is all of human consensus reality." So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?"
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Thursday August 10, 2006 @12:34PM (#15881951)
    So are these the fledgling footsteps of an emerging AI, or just the babbling beginnings of a bloated database?

    Cyc is a fledgling AI, depending on how you count "AI". Then again, so is my thermostat. My thermostat "knows" how to keep the room the right temperature. Cyc "knows" about a great deal of conventional human background, just like a database with a query system "knows" how to give you the data in that system.

    The real question is not "is this AI", but rather, is it useful, and if so, to whom? I think Cyc has the potential to be quite useful in some areas; we'll see in time how far it goes, and what the limitations are.

    Right now, I think the real problem with Cyc is understanding it on a practical level, and getting an understanding of what it can do in practice, not in theory. When I last looked at the project nine years ago, they were just starting to open up things a bit, and it sounded like someone who understood the project might make great things happen. They don't seem to have yet; but who knows... perhaps in the future.

    Now that OpenCyc is finally released, the most important step toward getting people to use it is to bring the learning curve down to a reasonable level, so that developers can start playing with it and find out what it can do without committing their lives to the project...

    We'll have to see what happens: Cyc is a big (bloated?) database that's also a fledgling AI -- the real question is, what cool things can we make it DO? Time will tell...
  • by nahgoe ( 901302 ) on Thursday August 10, 2006 @12:44PM (#15882055)
    The funny thing about common sense is that it is not common!
  • by Jerf ( 17166 ) on Thursday August 10, 2006 @12:46PM (#15882071) Journal
    If you could build a Cyc-like database simply by feeding it a large amount of more-or-less unstructured text, then the Cyc project wouldn't have been necessary in the first place.
  • by natedubbya ( 645990 ) on Thursday August 10, 2006 @01:30PM (#15882510)

    You can't compare Wikipedia to Cyc. If you do, you are misunderstanding what Cyc is and what it is not. Cyc is a database of logical relations representing common sense knowledge. It contains something like 20 different meanings of the word "lie", and other distinctions of that kind. It is not concerned with knowledge of popular culture, but rather with the underlying semantic rules that we use to talk about things such as pop culture.

    Completely different.


  • by Jon Peterson ( 1443 ) <jon@@@snowdrift...org> on Thursday August 10, 2006 @02:02PM (#15882852) Homepage
    That's not a solution. Are you saying that Vampires exist in Dickensian London? Are you saying that, in the real world, Dracula _isn't_ a Vampire??!!

    And that's the tip of the iceberg.

    Is powdered milk a dairy product? Can whales sing?

    I work with ontologies. There are too many contexts, and they are not well defined. You can't reduce human knowledge to an ontology and still have it be of any use to anyone. Cyc will fail, or it will succeed and we will have failed.
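For readers unfamiliar with what "contexts" means here, the following is a minimal sketch of context-scoped assertions, the general idea that Cyc calls microtheories. It is not Cyc's API: the class `ContextStore`, its methods, and the context names `RealWorld` and `DraculaNovel` are all hypothetical, chosen only to illustrate the Dracula example above.

```python
class ContextStore:
    """Toy fact store: each (subject, predicate, object) triple lives in a named context.

    Hypothetical illustration only; this does not mirror any real Cyc interface.
    """

    def __init__(self):
        self._facts = {}    # context name -> set of triples
        self._parents = {}  # context name -> parent context name or None

    def add_context(self, name, inherits_from=None):
        # A child context sees every fact asserted in its ancestors.
        self._facts[name] = set()
        self._parents[name] = inherits_from

    def assert_fact(self, context, subject, predicate, obj):
        self._facts[context].add((subject, predicate, obj))

    def holds(self, context, subject, predicate, obj):
        # Walk from the given context up through its ancestors.
        current = context
        while current is not None:
            if (subject, predicate, obj) in self._facts[current]:
                return True
            current = self._parents[current]
        return False


store = ContextStore()
store.add_context("RealWorld")
store.add_context("DraculaNovel", inherits_from="RealWorld")
store.assert_fact("RealWorld", "London", "isa", "City")
store.assert_fact("DraculaNovel", "Dracula", "isa", "Vampire")

in_fiction = store.holds("DraculaNovel", "Dracula", "isa", "Vampire")
in_reality = store.holds("RealWorld", "Dracula", "isa", "Vampire")
inherited = store.holds("DraculaNovel", "London", "isa", "City")
```

The sketch also shows where the objection bites: someone still has to decide which contexts exist, what each inherits, and which one a given question is being asked in.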

  • by johndcyc ( 864700 ) on Thursday August 10, 2006 @02:02PM (#15882855)
    Yes, we are. It will probably be published next week. OWL, specifically.
  • Meanwhile (Score:5, Insightful)

    by DrYak ( 748999 ) on Thursday August 10, 2006 @02:27PM (#15883145) Homepage

    move to a variant of SemanticWiki [...] If semantic statements become the standard, Wikipedia can be queried, which means that Cyc could be fed the data automatically.

    Meanwhile Google happily eats whatever crap its spiders manage to find and, through some hacking and dark-magic algorithms, is still able to give not-so-meaningless answers to not-too-badly worded queries.

    That's a key point explaining why OpenCyc came too late. WordNet [slashdot.org], ThoughtTreasure [slashdot.org], Cyc [slashdot.org] et alii all share a set of common drawbacks. Their input data needs to be specially formatted. That's why all those overly ambitious projects have progressed so slowly in the past years, and are still limited to answering precise, unambiguous, simple questions like "Is a cat a mammal?".
    This is linked to their fundamental design around a solid, inflexible, purely logical architecture (reading their respective Wikipedia entries helps to understand how they work). In a way, the scientists behind those projects tried to apply to human language the same kind of logic that is used in maths and programming languages. While this may be useful for some academic purposes, or for very specific applications where some reasoning is needed (it has been used and applied well there - I've seen it at least for WN and TT), it doesn't scale that well to REAL-WORLD(tm) situations.
    Their fundamental structure clashes with the reality of human reasoning: WordNet is limited to a single unambiguous meaning for each term (no things like "nut" as in the seed, and "nut" as in the thing that can be screwed onto a bolt). In the other programs, other "structured" design choices clash with real life's fuzzy nature.

    Meanwhile search engines have grown in a completely different way. Initially they were designed only to scan page content and index its keywords for later queries. Only after that, slowly, one hack after another, were they tuned: to make results more relevant; to avoid link farms; to find more complex strategies in the ranking calculation that return more correct and more meaningful results; to find results not only with matching keywords but with related keywords (Google's "Keyword is encountered only in pages linking to this target"); to cope easily with bad spelling (something that is very common in real life, that is difficult for a common-sense engine even to detect, but that is very intuitive in search engines and even more optimisable given the statistics such an engine can collect). And a lot of other small one-off improvements.
    And slowly, by having on one hand a system that gets a little more optimised each day and, on the other hand, an incredibly huge corpus to process that grows at a very fast rate, search engines like Google have become fantastic multipurpose information-retrieval tools.
    By now, you can type crap into Google and still get something (as long as it's not a "google-seppuku" kind of crap, but more of the "I'm very clumsy with my wording and my keyboard skills" kind). You can also get other wonderful information [blogspot.com], including stats on spelling errors [google.com] or even statistics-based translation [blogspot.com] (which are otherwise very difficult to get by classical means), and statistics about currently hot topics [google.com] (which can be fed back to improve results for ambiguous queries).
    All this because search engines are built around a fuzzy logic: at the core is a braindead-simple indexing rule, slightly modified by a bunch of hacks.
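As a rough illustration of "a braindead simple indexing rule, slightly modified by a bunch of hacks", here is a minimal sketch of a keyword index with one spelling-tolerance hack bolted on. It is a toy, not how Google or any real search engine works: the function names, the sample documents, and the 0.8 similarity cutoff are assumptions made for the example, and the fuzzy matching uses Python's standard `difflib` module rather than query statistics.

```python
import difflib
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase the text and split it into alphanumeric keywords.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    # The simple core rule: map each keyword to the documents containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query, cutoff=0.8):
    # Return the documents that contain every query word.
    result = None
    for word in tokenize(query):
        if word not in index:
            # The hack: fall back to the closest known keyword, if any.
            close = difflib.get_close_matches(word, list(index), n=1, cutoff=cutoff)
            if not close:
                return set()
            word = close[0]
        hits = index[word]
        result = set(hits) if result is None else result & hits
    return result or set()


# Hypothetical sample corpus.
docs = {
    "a": "The cat is a small mammal",
    "b": "Nuts and bolts hold the frame together",
    "c": "A nut is the seed of a tree",
}
index = build_index(docs)
misspelled = search(index, "mamal")
two_words = search(index, "nut seed")
```

Nothing in the index knows what a mammal is; the misspelled query works only because "mamal" is textually close to a keyword that happens to be in the corpus.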
    Such a fuzzy-logic approach, "without really needing to teach the machine everything", has recently been used successfully on
