I will say, I'm disappointed by the comments I've seen here on Slashdot.
The best comment came from an Anonymous Coward about the pining for an "emergent" type system: the fact that we're not wired that way, and that while more power gives you more degrees of freedom, it doesn't mean that everything can be analyzed together. You have to have some way of focusing (and a pretty darn good one, to keep the problem from blowing up unimaginably).
Bootstrapping works well when confined to a fixed arena, with observable and unambiguous criteria both for selecting a behavior or incorporating a piece of knowledge and for judging the success thereof. That is to say: a tight focus and goal-directed behavior. Without these, plus a tight feedback loop, the resulting system tends to disappoint.
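To make that template concrete, here's a toy sketch (every name and the "arena" are made up for illustration). Bootstrapping converges here only because the scoring function is cheap, objective, and tightly coupled to each candidate change:

```python
import random

def bootstrap(initial, mutate, score, steps=1000):
    """Keep a change only when an unambiguous criterion says it helped."""
    best, best_score = initial, score(initial)
    for _ in range(steps):
        candidate = mutate(best)
        s = score(candidate)       # the tight, observable feedback loop
        if s > best_score:         # the unambiguous selection criterion
            best, best_score = candidate, s
    return best

# Arena: guess a hidden bitstring; score = number of matching bits.
random.seed(0)
target = [1, 0, 1, 1, 0, 0, 1, 0]
score = lambda v: sum(a == b for a, b in zip(v, target))

def mutate(v):
    v = list(v)
    i = random.randrange(len(v))
    v[i] ^= 1  # flip one bit
    return v

result = bootstrap([0] * len(target), mutate, score)
print(score(result) == len(target))  # True: exact feedback makes it converge
```

Take away the score function (or make it noisy, delayed, or ambiguous, the way "understanding the world" is) and the loop has nothing to select on.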
Having as your scope "read the web to gain an understanding of the world" is, um... just a bit outside that template for success. While the big talk may be a prerequisite for grant interest, I doubt they have nearly as many illusions as the average Slashdot reader. I hope their work goes well, and I hope some of their techniques for extracting information from the web prove useful. That said, it looked like their initial target was classification only. Not trivial, but a very small part of the puzzle of intelligence, to say the least, especially when you consider that the classifications this thing will suck in will mostly be the sort that we don't take for granted.
And here I'll start reflecting my bias: I am a former #$HumanCyclist (I did an internship about 10 years ago). Even though I am in some ways disappointed, I do think the fact that they're actually building something, and have been for a lot of years, solving problems with it along the way, means that there's a lot to learn from them.
Among the things the Cyc project has shown is exactly how important these sorts of unstated classifications turn out to be for doing even the most mundane things right. But there's no point dwelling on that, because even assuming you have some impossibly large, beautiful graph reflecting a really solid, well-thought-out classification of everything, from every angle (hahaha), you're still nowhere.
Facts are fuel... the engine is the rules. Reading those from free text is a very, very dicey proposition, both because the parsing is infinitely harder and because, much more so than facts, rules are largely unstated; in terms of our own learning, we infer them from examples. You can set up probability matrices or the like, but only if you know what you're evaluating for (how would you program "curiosity"?). Even if you do get those matrices, reasoning with them directly is pretty much impracticable, so you have to make some arbitrary decisions about when you're confident enough to say you "know" something. This is just really, really hard knowledge to get in any automated fashion.
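Here's a toy sketch of what I mean by that arbitrary cutoff (nothing here resembles any real system; the names and weights are invented for the example):

```python
from collections import defaultdict

CONFIDENCE_CUTOFF = 0.9  # the arbitrary decision: why 0.9 and not 0.95?

class EvidencePool:
    """Accumulate weighted, possibly conflicting evidence per proposition."""
    def __init__(self):
        # proposition -> list of (source_weight, supports?) observations
        self.evidence = defaultdict(list)

    def observe(self, proposition, source_weight, supports=True):
        self.evidence[proposition].append((source_weight, supports))

    def confidence(self, proposition):
        obs = self.evidence[proposition]
        if not obs:
            return 0.0
        support = sum(w for w, s in obs if s)
        total = sum(w for w, _ in obs)
        return support / total

    def known(self, proposition):
        # above the cutoff we simply declare it a "fact"
        return self.confidence(proposition) >= CONFIDENCE_CUTOFF

pool = EvidencePool()
pool.observe("birds can fly", 0.8)
pool.observe("birds can fly", 0.7)
pool.observe("birds can fly", 0.3, supports=False)  # penguins...
print(pool.confidence("birds can fly"))  # 1.5/1.8, about 0.833
print(pool.known("birds can fly"))       # False: just under the cutoff
```

Every number in there (source weights, the cutoff, the way evidence combines) is a design decision with no principled answer, which is exactly the problem.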
Finally, for both facts and rules, the consequences of incorporating a poorly considered one can be quite dire, and there's no practicable way (as the amount of knowledge grows) to know whether a new one is consistent with what is considered true up to that point.
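A toy illustration of why (again, invented for the example): even the most naive possible consistency check, which only catches a fact directly contradicting its own negation among ground literals, costs a scan of the whole KB per new fact, and it misses everything interesting. Real consistency involves chains of rules, and for first-order logic it's undecidable in general.

```python
def is_negation(a, b):
    """Direct negation only: p vs ("not", p)."""
    return a == ("not", b) or b == ("not", a)

def consistent_to_add(kb, new_fact):
    # O(n) per new fact, and it only sees direct clashes
    return not any(is_negation(new_fact, f) for f in kb)

kb = {"penguin(tweety)", ("not", "flies(tweety)")}
print(consistent_to_add(kb, "flies(tweety)"))  # False: direct clash, caught
print(consistent_to_add(kb, "bird(tweety)"))   # True -- but combine it with a
# rule like "birds fly" and the KB is inconsistent anyway; the naive check
# can't see contradictions that only emerge through inference
```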
Getting even more slippery: there is no single context or frame in which to consider everything. This applies equally to facts and rules. You could try to split hairs and say that given enough antecedents, your facts and rules are solid. As any kind of remotely practical matter, though, you need a way of accumulating and organizing those antecedents, and that's true both from a technical (engine execution) and a practical (ease of reasoning and learning) perspective.
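A rough sketch of what "organizing the antecedents" can look like: facts qualified by context, with more specific contexts inheriting from more general ones. This is loosely inspired by Cyc's microtheories, but to be clear, it's my toy version, not Cyc's actual machinery:

```python
class ContextualKB:
    """Facts indexed by context; contexts inherit from a more general parent."""
    def __init__(self):
        self.facts = {}    # context name -> set of facts asserted there
        self.parents = {}  # context name -> more general context it inherits from

    def assert_fact(self, context, fact):
        self.facts.setdefault(context, set()).add(fact)

    def holds(self, context, fact):
        # a fact holds in a context if asserted there or in any ancestor
        while context is not None:
            if fact in self.facts.get(context, set()):
                return True
            context = self.parents.get(context)
        return False

kb = ContextualKB()
kb.parents["GreekMythology"] = "Everyday"
kb.assert_fact("Everyday", "horses cannot fly")
kb.assert_fact("GreekMythology", "Pegasus flies")
print(kb.holds("GreekMythology", "Pegasus flies"))      # True
print(kb.holds("Everyday", "Pegasus flies"))            # False: wrong frame
print(kb.holds("GreekMythology", "horses cannot fly"))  # True (inherited)
```

Even this trivial version forces engineering choices (what overrides what, how contexts relate) that a pile of flat facts scraped from the web doesn't come with.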
Oh, and as a minor matter, languages are difficult enough along the syntactic dimension, and the semantics (to understand a statement, you have to understand the ones prior, the context or framing that may have shifted, the built-up assumptions that maybe can be discarded, maybe not, etc...) make for a truly, fantastically difficult problem.
Finally, not everyone uses perfect grammar, is perfectly informed or even tells the truth all the time.
So... you might say the problem is a bit tricky :)
The solution may not look like Cyc very much, I don't really know... that's just the lens I've spent more time looking through than any other. I will say that I'm deeply skeptical of the ability to engineer higher-order, general-case reasoning with neural nets, because there's no way to understand intermediate states and do debugging; but if someone shows me it works, then great.
I wish this group luck in making meaningful incremental progress... mainly I think in the direction of techniques for extracting info from the web rather than on the general problem, but every bit helps.
I wish the same for my former employer, who I believe has made meaningful incremental progress many times.
Just because nothing big enough to live up to the overblown expectations of the field has happened doesn't mean nothing is happening.