The Internet

Berners-Lee On The Semantic Web 112

Posted by timothy
from the build-it-they-will-come dept.
An Anonymous Coward writes: "Wonder about the future of Internet communications? In a new article on the Scientific American website, Tim Berners-Lee tells you what to expect. If you don't know who Tim Berners-Lee is, go ask CowboyNeal." Coming from the guy whose work spawned the WWW, this is some speculation worth taking seriously; the article addresses applications of a more integrated Web and explains some of the tasks necessary to make it happen.
This discussion has been archived. No new comments can be posted.

The Semantic Web

  • by Anonymous Coward
    ...his introductory example makes me giggle. We have a service that gets data from these insurance companies. Most of them are still on mainframes running COBOL. We do EDI with some of them. You may have heard that XML is replacing older formats like X12. Well, a lot of these companies are just now moving to X12. By the time this industry gets everybody moved to some global semantic web system, we'll have so many nanobots in our blood we won't need doctors anymore.
  • by Anonymous Coward
    Like "problem" and "opportunity" the terms both mean (denote) the same thing, but have rather different associated implications and feelings (connotations).

    If you think of "http://slashdot.org" as an address you feed to an HTTP client library which will open a TCP connection to the IP address which the DNS gets in doing a query for "slashdot" in the "org" domain -- then it's a URL.

    If you think of that string as identifying something which the web infrastructure, with local caches, remote caches, a distributed page-generating infrastructure, etc., can probably cause to be displayed on your screen, then it's a URI.

    Since that's how the web really works, calling it a URI helps show you know how the web really works, or something. Alas, it also goes against a lot of people's ingrained language.
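    The "address you feed to an HTTP client library" reading is easy to make concrete; here's a minimal sketch with Python's standard urllib.parse (the example URL is the one above; nothing else in this snippet is from the post):

```python
from urllib.parse import urlsplit

# Read as a URL: a recipe for fetching. The scheme picks the protocol
# (hand it to an HTTP client library); the netloc is what you send to
# DNS and then open a TCP connection to.
parts = urlsplit("http://slashdot.org")
print(parts.scheme)  # http
print(parts.netloc)  # slashdot.org

# Read as a URI: the very same string is just an identifier; caches,
# mirrors, and page-generating infrastructure may satisfy it without
# ever contacting that host.
```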

  • by Anonymous Coward

    Tim Berners-Lee (from www.everything2.com [everything2.com])

    "The creator [everything2.com] of the World Wide Web [everything2.com]. Probably had no idea [everything2.com] that his scheme [everything2.com] for presenting physics [everything2.com] research [everything2.com] would be used for fishcams [everything2.com], porn [everything2.com], or Everything [everything2.com]."

    or

    "The man started with grand ethereal visions; he uses the phrase 'World Wide Web [everything2.com]' to mean 'the universe of information'. His approach to getting there on the other hand was extremely down to earth: in practice, the Web is a simple and practical methodology for document exchange over TCP/IP [everything2.com], based on a new universal Internet document addressing method, the URL [everything2.com], a new TCP/IP protocol, HTTP [everything2.com], and a new document description language, HTML [everything2.com], and it reached the world in the form of a functional range of software tools, originally programmed on the NeXT [everything2.com] platform in Objective C [everything2.com], later ported to C [everything2.com] to work on other platforms.

    His team's combination of very high reaching ideals and a very practical approach to implementation, later shared by other Web pioneers, accounts for its enormous success.

    I will never forget the sight of him at one of the early WWW [everything2.com] conferences, where thousands of people, including the big guys from some of the big software vendors and research labs, and people like Ted Nelson [everything2.com], had come to his workplace, the CERN [everything2.com] lab in Geneva, to share the excitement about this new world of interlinked information that once had existed only in his own mind. He was nervous and seemed pretty much overwhelmed by the whole event. It's exciting to see a man's wildest dream become reality!"

  • Seriously, we should thank him for putting some pieces together with HTTP and HTML. It was the right idea at the right time. But it sure doesn't mean everything he thinks of is going to be just as successful.

    I'm not happy with how everything he seems to put his hands on is assumed to be the next big thing(tm). It's sort of like how the people who used to run Netscape (the original F*cked Company) seem to think they can turn any idea they come up with into a similarly 'successful' corporation.

    http://news.getschooled.com/ [getschooled.com] is for the easily amused
  • Human systems of thought and morals have changed with the centuries. Always language and culture express them. Are we moving toward global definitions of existence and meaning dictated by computer vendors (re: ontologies) ?

    How will the hackers respond? :-)
  • "Grand ethereal visions"?? What a crock of shit. He modified SGML to create HTML ...

    Sour grapes? The idea behind a "World Wide Web" did come from Berners-Lee; "hypertext" at that time was theory - were we all "using hypertext" in 1990? I don't think so. Existing tools & protocols were not easy to use - gopher did not hit the spot. He made a simple, scalable advance with http/html and encouraged the development of servers and browsers. What he started as a result was something we needed, if not "visionary". The fact that he did it for some unrelated purpose is no surprise - that's how most great inventions happen. Read his book, then comment.

    Linus Torvalds also created something for his personal interest, but what it started was also something we needed, if not "visionary".

  • Among the many problems mentioned by other posters there is, in addition, the problem of a lack of sensory input. It is attractive to think of the Web as a substitute for machine vision and audition, but the example in the article has embedded in itself one reason why this is not enough: How quiet is quiet enough for a phone call? It's fine to say "Turn down the volume." But without an integration of knowledge of how quiet is quiet enough for a phone call for this person, and an actual measurement that things are getting quiet enough (and possibly the blender and/or exhaust fan ought to be slowed or stopped as well), you really do not have a system that is practical, even for turning down the volume.
  • In the 70's and 80's, researchers spent years developing "semantic webs" to represent small domains: block stacking, physics problems, locomotive repair, and, ambitiously, medical diagnosis. These databases of facts and rules were fed through inference engines which would seek to combine them to solve problems.

    Despite careful hand-tuning, none of these systems ever achieved any practical use (witness the paperclip guy). Why? Because they're too hard. Getting the data structures to make sense and behave consistently, even in a small system, is too tricky and unreliable. On the web, it's impossible. Humans have a very hard time understanding and agreeing on what data, even "unambiguous" semantic data, means.

    The semantics of computer programs, expressed in an unambiguous language, are constantly going wrong, or at least beyond what is intended. The far more complex semantics of real life and the web are going to be much more difficult to manage.
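    To make the parent's point concrete: those 70s/80s systems amounted to a database of facts plus rules, fed through an inference engine until nothing new could be derived. A toy sketch of that loop (the block-stacking facts and the single "above" rule are invented for illustration):

```python
# A toy forward-chaining inference engine: facts are triples, a rule
# derives new triples, and we iterate to a fixed point.
facts = {("block_a", "on", "block_b"), ("block_b", "on", "table")}

def above_rule(fs):
    # "X on Y" implies "X above Y"; "above" is transitive.
    new = {(x, "above", y) for (x, r, y) in fs if r == "on"}
    new |= {(x, "above", z)
            for (x, r1, y) in fs if r1 == "above"
            for (y2, r2, z) in fs if r2 == "above" and y == y2}
    return new

changed = True
while changed:
    derived = above_rule(facts)
    changed = not derived <= facts
    facts |= derived

# transitively derived: block_a is above the table
print(("block_a", "above", "table") in facts)  # True
```

    Even this trivial example hints at the scaling problem: every new relation needs hand-written rules, and the rules have to agree with everyone else's.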

  • Yes, it's true that MYCIN did reasonably well, but no one has ever been treated solely from a MYCIN diagnosis. The reason is simple: malpractice lawsuits. I don't think it would even be allowed to be used in conjunction with a real human due to the possibility of misuse.

    AI is a great field, and we shouldn't generalize too much about the success or failure of every application. However the lessons of these experiments are clear: logic bases are very difficult to construct, and they don't tolerate mistakes well. Having a large number of people putting in their own semantic interpretations will not result in a useful corpus.

    In fact I would say that one branch of AI, language parsing, would probably outperform a user-input semantic base by a large margin. For one thing, it would allow new semantic assertions to be created by "re-compiling" web pages as needed.

    For instance, if you want to construct a database of businesses with their hours of operation, it would probably be easier to search web pages for patterns like "We're open from {starttime} to {endtime}" than to convince a bunch of webmasters to enter the data for you.
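    The pattern-scraping approach above is trivial to prototype; a minimal sketch (the page text and group names are hypothetical, and a real scraper would need many pattern variants):

```python
import re

# A hypothetical business page containing an hours-of-operation phrase.
page = "Welcome to Ed's Diner. We're open from 7am to 9pm every day."

# Named groups capture the {starttime} and {endtime} slots of the pattern.
m = re.search(r"open from\s+(?P<starttime>\S+)\s+to\s+(?P<endtime>\S+)", page)
if m:
    print(m.group("starttime"), m.group("endtime"))  # 7am 9pm
```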
  • Too much technology

    There is no basis for your premise whatsoever.

    Virtually all the evils of the world are the product of human will, not of technology. People use technology to cause suffering of course, but to blame the technology is obviously misguided; at worst the blame lies with particular toolmakers, and in every case it lies with the toolwielders. Once in a while something technical does break and causes suffering by itself, but compared to the suffering inflicted by Man that is utterly insignificant.

    Was there less suffering before science and engineering started transforming the world in a big way? I don't think so.

    Technology empowers everyone, including those that care not about the plight of others, but in the current makeup of the world that translates to vastly more good people being empowered than bad people. Whether or not that is a factor, you really can't look back over history (first removing the rose-tinted spectacles) and claim that human is good, technical is bad.

    In technology there is ample promise for completely eradicating or bypassing or overcoming human evils --- ultimately by keeping everyone at arms length if all else fails. In human development, the chances of finding a viable solution of any sort seem to be rather less than zero.
  • Prisoners are isolated for punishment... We are isolating ourselves for convenience?

    I think you may be missing the point entirely. We're isolating ourselves in order to forge liaisons and interact with others at our own personal convenience, rather than at the convenience of others. Amongst other things, it empowers us to interact with multiple people and multiple communities, increasing the level of human dialogue which you hold so dear. And before long, hopefully that dialogue won't be limited to conversing with humans alone. As machine IQ rises, old "natural" humanity will at some point become very second best for those desiring a fulfilling intellectual relationship.

    Prior to technology, we had no option in any of this. Now we do.
  • this is precisely the sticky point: malicious, subversive, or even just plain incompetent ontologies. (it used to be you could only find management that was like this, but now software is acquiring these traits, sigh.)

    example: ((send) (the funny clowns) (to (party address))) w/ the cdaddr misparsed as "party dress".... now we have big shoes and red noses poking about at the fashion mart as well as the post office. ugh.

    on the other hand, i'm glad tblee is moving away from simply syntax. good for him, good for everyone.

  • I view this type of service as a "super" yellow pages.

    But when you look at current yellow pages, it won't be easy!
    For example, you can find the location of a shop on yellow pages, but you can't find its OPENING HOURS!
    If you're lucky, there is also the address of a webpage which is usually obsolete and very UNinformative..

    So maybe in the future, those agents will be everywhere, but first the "impedance mismatch" between the real world and its representation in the virtual world must be reduced..
  • Ted Nelson patented the hyperlink in 1960.

    (as I recall, it was linked to in a recent /. discussion, but I'm TLTLIU.)
  • "[the CYC people] are about to release some of the project after 17 years of development..."

    Exactly. 17 years later. Perhaps the most powerful of TB-L's ideas here is to accept inaccuracies. If you insist that everything be consistent, well-defined, centralized, and proper, you're back to the morass of AI, and nothing useful ever gets out of the lab.

    The same thing was true with networked hypertext in general, which is why Xanadu never got off the ground and why Tim's WWW took off exponentially.
  • "This utopian information access idea is great in principle but at the moment we just don't have the always on style internet access available."

    There are many hurdles between us and the Semantic Web, but lack of always-on connectivity is a small one. Do people use the non-semantic web today, despite not having a gigabit device in their ear all day? Of course. People send email, make Priceline requests, send off EBay bids, and go on with their day, checking back later. Many other services (search engines) are fast enough that people can execute a whole transaction in a few seconds at their desk.

    The Semantic Web merely expands the range of services we can ask the Web to handle for us. If it were here today as envisioned in the article, I would use it all the time on my broadband connection. If I only had 28.8 dialup, I would use it when I was in the mood to dialup, the same way I approached anything else on the Web back then.
  • Try oingo.com. They claim to be able to search semantically using an ontology. It's not quite Google yet..
    Fun searches are "chips" "jaguar" and any other vague term...

    I work for a company trying to do the same in medical field.
  • I remember Cyc ... it was built when the Japanese were beginning their "5th generation computer language" project, that was supposed to enable giant robotic mechs to wander the landscape breathing fire, right? As I remember their project went nowhere.

    I wish Cyc would release *something* playable-withable soon... I mean come on, it's been *17 years*!!! Put that thing to work checking my IRS forms!
  • by streetmentioner (28936) on Wednesday April 11, 2001 @12:36AM (#299610) Homepage
    Some systems exist to extract facts from language into semantic knowledge representations, and they're surprisingly good.

    SNOWY is a system that "reads" the World Book Encyclopaedia and stores each fact about a concept into a hierarchic memory based on that concept. It's sufficiently sophisticated to be able to realise that "The bear digs up the nut" implies that the bear eats the nut, while "The miner digs up the coal" doesn't imply that. You can then ask it "what eats nuts" and it will reply correctly. (At least, this is my impression - I haven't used it, sadly.) As I remember it can fully understand 50-60% of the sentences in the bits of the encyclopaedia that it has been commanded to parse.

    The language it works on is fairly simple, but is nevertheless text designed for humans as opposed to computers. Systems like this could be a good bridge between language and semantic based representations.

    This is the best link I can find, unfortunately. [ucf.edu]

    There are also, of course, dozens of systems designed to work on English text that has been specifically created to be computer-parsable, but still readable by humans.

    I'm incredibly sceptical about all this sort of technology, but if the systems continue to evolve, the agents might be able to glean much of their knowledge from existing web pages.
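    I haven't seen SNOWY's internals, but the "bear digs up the nut" inference can be caricatured in a few lines: facts get filed under their subject concept, and the digs-up-implies-eats rule is gated on what kind of agent does the digging. Everything below is invented for illustration, not SNOWY's actual design:

```python
# A toy concept-indexed memory: facts stored under their subject.
memory = {}

def store(subject, relation, obj):
    memory.setdefault(subject, []).append((relation, obj))

# Background knowledge a real system would get elsewhere.
is_animal = {"bear": True, "miner": False}

def digs_up(agent, thing):
    store(agent, "digs_up", thing)
    # "X digs up Y" implies "X eats Y" only when X is an animal.
    if is_animal.get(agent):
        store(agent, "eats", thing)

digs_up("bear", "nut")
digs_up("miner", "coal")

def what_eats(thing):
    # Answers "what eats <thing>?" from the stored facts.
    return [s for s, fs in memory.items() if ("eats", thing) in fs]

print(what_eats("nut"))   # ['bear']
print(what_eats("coal"))  # []
```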

  • I doubt it will be able to find my father's brother's
    Father's brother, that's your uncle...
    nephew's
    Uncle's nephew. Probably this is just YOU, but it could be any of your siblings I guess, or cousins.
    cousin's
    Cousin here gains you nothing; all the uncle's nephews are each other's cousins.
    former room-mate
    We're just looking for someone who shared a room with you, your siblings or your cousins then. Hope that makes things at least a bit simpler.
    Pre......
  • Great link, thanx. I read about Cyc and Lenat in the book 'Out of Their Minds', a whole chapter devoted to it/him.
  • by s390 (33540) on Wednesday April 11, 2001 @01:27AM (#299613) Homepage
    because that light at the end of the tunnel is a train called "pervasive computing" and it will be here soon. Ubiquitous connectivity, XML/SOAP protocols (assuming M$ doesn't hijack these), more capable and standardized interfaces to extensive backend data warehouses, IPv6 addressability and service level discrimination, smarter Java-based intelligent agents, speech recognition, natural language processing - these will all contribute to the second networked revolution in the ways we work and interact online. Berners-Lee has an academic vision of how some of this might work, and I applaud his courage for sharing his ideas.

    Today, you can be driving on the freeway and using speech recognition to look up and call colleagues through your handsfree cellphone. It's not much of a stretch to add calendar administration and other interfaces with intelligent agents to this.

    Scenario: You're flying down I405 in SoCal (in the carpool lane, with coworkers aboard) some beautiful late afternoon in the not too distant future:

    "Princess (you've named your general digital assistant Princess Leia, for some reason), please check on improving my car insurance rates."

    Princess: "Connecting insurance agent..."

    Fred (you call your intelligent insurance agent program "Fred"): "Fred here..."

    You: "Fred, please see if I can get a better rate on my car insurance this year."

    Fred: "OK, I'm looking..."

    You: "Princess, please tell my wife I'll be home early."

    Princess: "What's your ETA, please..."

    You: "Sixish Princess, thanks."

    Princess: "Thank you, will do."

    Fred: "You have six quotes, two of which are at lower premiums than your existing contract. Do you have any recent tickets or accidents to add?"

    You: "No, thank you. What's my best choice?"

    Fred: "Suckem-Dumpem Mutual offers you a $300 annual premium savings counting the good driver discount."

    [I405 stops dead as it's wont to do randomly, including the carpool lane.]

    Screeech Crash Tinkle. [a moment of dead air...]

    You: "Fred, forget it. Princess, please tell my wife I'll be a little late."

    Fred: "Request closed, no action. Bye."

    Princess: "OK, will do."

  • by Azza (35304) on Wednesday April 11, 2001 @01:12AM (#299614)
    Excellent article. I agree this is the way things should be heading, but the biggest problem is going to be defining standards for the information. The largest information providers will be trying hard to hold on to, and control access to, their so-called intellectual property.

    The semantic web depends on universal open standards for access to this information. MS's solution, HailStorm, already tells you what they think of that idea. Let's hope that we can avoid another browser (agent) war...
  • I was quite impressed with the article. It is very brave to advocate a system for achieving such goals without requiring the imposition of standards and review in order to verify that the information is of value.

    My question would be: how would this semantic web deal with the problem of garbage in/garbage out? If anyone can author anything, what's to help the agent distinguish a real person, place or thing from any fictitious counterparts?

    I would liken this problem to that faced in the use of digital certificates. Certificate Authorities (like Verisign) needed to be created so that people could have a common point of reference that they could trust as reliable. I can set up a computer as a CA (easy to do with Win2K for example), but there is no good reason why you would trust my CA to issue certificates to verify the identity of some third party that you want to trust. After all, who the heck am I?

    Similarly, the garbage in/garbage out problem seems to beg for some commonly trusted authorities. I don't read the National Enquirer for news because I don't believe it to be credible. But I will trust what I read in the New York Times. How could an agent under this semantic web scheme make similar distinctions?
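    Short of real certificate-style infrastructure, the crudest answer is a hand-maintained credibility table consulted before an agent believes an assertion. A toy sketch (the sources, scores, and threshold are all invented for illustration):

```python
# Invented per-source credibility scores; a real agent would want
# something like certificate chains or a web of trust instead.
CREDIBILITY = {"nytimes.com": 0.9, "nationalenquirer.com": 0.1}

def believable(source, threshold=0.5):
    # Unknown sources default to zero credibility.
    return CREDIBILITY.get(source, 0.0) >= threshold

print(believable("nytimes.com"))           # True
print(believable("nationalenquirer.com"))  # False
```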
  • This is why I'm absolutely 100% certain that we'll all learn Lojban soon. Yup, there is no doubt in my mind. None at all...

    Didn't they say that about Esperanto? The only place I see that now is on re-runs of the early Red Dwarf series.

  • I'm not sure that I'd use the semantic web if I were using a 28.8 modem - it would have no great appeal to me, as the whole urge to move to the semantic web is that agents can sit in the background and set stuff up for me - without the always-on connection this segment of the semantic web is almost useless. I want my PC, my Palm and my phone to tell me stuff and to remind me of stuff without having to line up IrDA ports or place one in a cradle.

    It's these wired and connection-based boundaries that will cause the problem. What we need to develop first is the "two way web" which can remind or push information to you through whatever device you are logged onto. Then we can start to make the web machine readable and get the semantics into the two way.

  • by matthew.thompson (44814) <matt AT actuality DOT co DOT uk> on Wednesday April 11, 2001 @12:21AM (#299618) Journal
    This utopian information access idea is great in principle but at the moment we just don't have the always on style internet access available.

    A similar idea is being touted by Orange [orange.co.uk] whose grand plan is to use an always-on mobile terminal device with their Wildfire [wildfire.com] personal assistant, which will listen to your day and arrange things to happen, inform you of information, and collate calls and messages whether they are voicemail, email or faxes.

    But until we have the always-on, always-connected devices we're still going to be pretty much tethered to our desks.

  • The Manufacturer and Builder Volume 13, Issue 1

    January 1881

    Pneumatic Tubes Supersede Cash Boys

    The incessant calls for cash boys, which formerly made shopping in our larger establishments so wearisome, if not exasperating, were silenced and the terrors of shoppers greatly mitigated by the introduction of electric calls. An enterprising Philadelphian, Mr. John Wanamaker, has gone a step further, and displaced the dusty skurrying of cash boys and cash girls by a system of pneumatic tubes. Under the new system, an inspector and wrapper is stationed at each counter, who will receive the money and goods with the seller's check. While goods are being wrapped up, the cash, with the proper vouchers, will be transmitted to a centrally located cashier, who will return the change through the proper tube. There are two such tubes leading from each counter to the cashier's inclosure. One of the tubes is to carry the money to the cashier, and the other is to return the change and accompanying check to the counter again. The "carriers" which work inside the tubes are little cylindrical boxes of sheet steel, lined with green baize, and protected at each end by diminutive felt cushions. Each carrier is of the exact diameter of a silver dollar, and is capable of holding thirty of the latter pieces, or a much larger sum. By means of a steam engine and exhaust pump in the cellar, with proper attachments leading therefrom, the air is being constantly exhausted at the cashier's end of one tube and at the counter end of the other tube of each pair, and when a "carrier" is placed in the mouth of either tube, it is immediately drawn to the other end, and is there delivered automatically by an apparatus devised for that purpose. This system not only saves time and noise, but the wages of an army of boys or girls, besides discharging a large amount of fresh air into the building, greatly improving ventilation.

    Pneumatic tubes, the Amazing Revolution of the late 19th century! Why, it's "pneumati-commerce"! And it even freshens the air! Does e-commerce do that?
  • > if the systems continue to evolve, the agents might be able to glean much of their knowledge from existing web pages.

    April 1, 2038: SkyNet gains sentience, having gleaned most of its knowledge from web pages.
    April 2, 2038: SkyNet proclaims f1rst p0st, d00d, and promptly goes into a coma fantasizing about h0t gr1tz and how all Natalie Portman's daughter are belong to it. Humanity doesn't notice.

  • Bah! That's been going on since the early '90s! Remember this guy?

    "UN-altered REPRODUCTION and DISSEMINATION of this IMPORTANT Information is ENCOURAGED, ESPECIALLY to COMPUTER BULLETIN BOARDS."
    - Robert McElwaine, net.kook extraordinaire...

  • Why is it that everyone these days is ranting about how XML will change the world, when SGML, despite being around for god-knows-how-long, has quietly done its job yet not attracted any of the hoopla its XML cousin has, nor become some kind of Grand Unified File Format that the XML-advocates promise?

    Without a set of standard schemas, XML will be just as difficult to index as free-form ASCII documents, since i might write:

    15 My Street, My Town

    while someone else might write

    My Street

    and someone else might write
    15My Street

    or any other variant, none of which are directly comparable with each other without some translation layer which is as much of a pain in the ass to write with the uber-ugly XSLT as it is with Perl.

    I guess i just fail to see what XML gives you over
    ASCII with regard to a 'standard' file format.

    XML doesn't specify anything useful, as opposed to HTML which (mostly) specifies the meaning of its tags.

    i.e. When i save an HTML document with a <title> tag, the meaning of the content within that tag can be inferred, based on the HTML 4.0 spec, to be the title of the document.

    You can infer precisely nothing about the meaning of an XML document's tags.

    Of course, XML has the ability to use *any* schema, from which the meaning of content can be inferred, but since the schemas follow no standard, you're back to square one.
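    The translation-layer pain is easy to demonstrate: here are two hypothetical markups of the same address and the per-schema glue you're forced to write (the tag names and addresses are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Two invented schemas encoding the same street address.
a = ET.fromstring("<address>15 My Street, My Town</address>")
b = ET.fromstring("<addr><num>15</num><street>My Street</street></addr>")

def normalize(elem):
    # Without a shared schema, every variant needs its own hand-written
    # translation rule -- the "translation layer" complained about above.
    if elem.tag == "address":
        return elem.text.split(",")[0].strip()
    if elem.tag == "addr":
        return "{} {}".format(elem.findtext("num"), elem.findtext("street"))
    raise ValueError("unknown schema: " + elem.tag)

print(normalize(a))  # 15 My Street
print(normalize(b))  # 15 My Street
```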

  • ah fuck the tags in my post got munged.
  • Rational programming/fuzzy logic/quantum software blah blah. It's all still rule-based programming created in advance of its use by programmers.

    ...yes and some of it is scalable to absorb the knowledge published on the web and some of it isn't.

    The approach Lenat et al have taken with Cyc is a good example of the failure of the "logic" approach to actually become intelligent enough to start gobbling up the web. TBL's semantic web has the same problem.

    Fuzzy logic fails to adequately deal with derivation of probabilities from statistics from first principles.

    Quantum software is too general -- presuming combinatoric systems are soluble which simply aren't with mechanistic systems.

    Neural nets are a big field. Learning without supervision is necessary to "index" the web in the sense that people are targeting -- and feasible -- it just hasn't been approached in a way that makes sense in the sense meant by John McCarthy.

  • Pls start off with an example of your basic point, here, rational programming, so people can decide whether they should try and decipher the rest of a long, confusing post.

    Here's a counter-suggestion:

    Stop being an anonymous coward when proclaiming someone else's prose to be "confusing". You may or may not be the intended audience. The intended audience is language designers who are familiar with the background concepts of TBL's semantic web and are professional enough about it to do some homework if they buy into it. Frequently a good paper may take literally a month for someone versed in the field to read and actually comprehend. That is not the same as targeting language users.

    If you are a language user, not a designer, then there's not much to show you since the point of design is to create things to show you, and this is a design philosophy document rather than a design document. However, if it will make you feel better, here is an off-the-top-of-my-head example of how one might use the relation arithmetic philosophy in converting relational operations from their present form to a more arithmetic form -- with some room for alternate notations:

    // simple column composition
    address=street (city state andor zip)
    // means the same as
    address=street*(city*state+zip)
    // means the same as
    address=street&(city&state|zip)
    // means the same as
    address=street,(city&state|zip)
    // means the same as
    address=street,(city,state|zip)
    // means the same as
    street=address/(city state zip)
    // implying that
    street = address.street
    name=firstname lastname
    name = residence / address
    residing_with = ((name address)^2 - name.1==name.2 - address.1!=address.2)/address^2
    // means the same as
    residing_with = ((name address)^2 - name.1==name.2 - address.1!=address.2).name^2
    // means the same as
    residing_with = ((name address)^2 - name.1==name.2 - address.1!=address.2).(name name)
    // means the same as
    residing_with = ((name address)^2 - name.1==name.2 - address.1!=address.2).(name.1 name.2)
    name.1 = residing_with/name.2
    // means the same as
    name.2 = residing_with/name.1

    Some salient features of this, admittedly limited and ad hoc, example are:

    Duplicate row counts are preserved through arithmetic operations.

    Column names start to behave like engineering units.

    Factoring out data via division is as natural as combining data via cross-product.

    The arithmetic rules need not be programmed -- they may be statistical inferences.

    Addition and subtraction are like insert and delete except that duplicates are counted and redundant deletes accumulate as negative row counts.
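    The notation above is the poster's own; purely as one reader's guess at the row-counting semantics in this list, relations can be sketched as multisets of rows (the function names and representation here are my own, not part of the proposal):

```python
from collections import Counter

def rel(*rows):
    # A relation is a multiset of rows; a row is a frozenset of
    # (column, value) pairs, so duplicate rows are counted.
    return Counter(frozenset(r.items()) for r in rows)

def product(a, b):
    # Cross product: merge column bindings, multiplying duplicate counts.
    out = Counter()
    for ra, ca in a.items():
        for rb, cb in b.items():
            out[ra | rb] += ca * cb
    return out

def minus(a, b):
    # Duplicate-counted delete; redundant deletes go negative.
    return Counter({r: a[r] - b[r] for r in set(a) | set(b)})

def divide(a, cols):
    # Factoring out: project away the named columns, summing counts.
    out = Counter()
    for row, c in a.items():
        out[frozenset((k, v) for k, v in row if k not in cols)] += c
    return out

# Two names times one address gives two rows; dividing the address
# back out recovers the names.
people = rel({"name": "ann"}, {"name": "bob"})
homes = rel({"address": "x"})
rows = product(people, homes)
names_only = divide(rows, {"address"})
```

    Whether this matches what relation arithmetic actually requires is another matter; it is only meant to make the row-count bookkeeping concrete.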

    What happens once statistical rules like this are asserted or inferred is that symbols like "address" or "residing_with" can be used within scalar contexts and their meaning will behave in a Monte Carlo fashion, selecting, usually with replacement, from the distribution of values in the relation represented by the symbol. If it doesn't work, no biggie, just do another Monte Carlo run or add another rule to change the distribution to be more representative of the perspective of the user.

    There are other dimensions to this, alluded to in the "statistical inference" item above, that are a direct result of using pattern matching to detect "ontology" confusion and unify terms that may appear to be different "as a statistical rule" within certain constraints. This gets around a lot of the "ontology" noise and lets systems detect/suggest translation routes between domains.

    The example above doesn't get into negative numbers very deeply except to show how they can be used to create a "select where" type statement by subtracting out rows that don't fit the criteria. Negative numbers also come up with credibility. Credibility comes into play when one of the columns is the speaker, or object, making the assertion represented by the rest of the row. In that case, there may be a lot of conflicting negations, thereby bifurcating, trifurcating, etc. the "logic" space into different points of view -- each of which may be internally consistent but which are in direct opposition with one another. This is similar to the existence of negative as well as positive correlation coefficients in regression. Depending on how you "identify" yourself, you may experience a positive number of certain dimension while someone else of an opposing identity experiences a negative number of the same dimension.

  • by Baldrson (78598) on Wednesday April 11, 2001 @12:58AM (#299626) Homepage Journal
    As I posted to Slashdot a year ago on the topic [geocities.com]:

    The future of the Internet is in what I call "rational programming" derived from a revival of Bertrand Russell [gac.edu]'s Relation Arithmetic [geocities.com]. Rational programming is a classically applicable branch of relation arithmetic's sub theory of quantum software (as opposed to the hardware-oriented technology of quantum computing [rdrop.com]). By classically applicable I mean it applies to conventional computing systems -- not just quantum information systems. Rational programming will subsume what Tim Berners Lee [ruku.com] calls the semantic web [w3.org]. The basic problem Tim (and just about everyone back through Bertrand Russell) fails to perceive is that logic is irrational. John McCarthy's [stanford.edu] signature line says it all about this kind of approach: "He who refuses to do arithmetic is doomed to talk nonsense." More on this a bit later, but first some history, because he who fails to learn from history is doomed to repeat its nonsense:

    When I invented the precursor to Postscript [berkeley.edu] (an audacious claim that I can back up [geocities.com] -- it started as a replacement for NAPLPS [ucl.co.jp] which I proposed while Manager of Interactive Architectures for Viewdata Corp of America [uiuc.edu] back in November of 1981 -- the Xerox PARC [rwth-aachen.de] guys found my approach of what they called a "tokenized Forth" communication protocol to be an intriguing way to encode text and graphics), I was interested in having a Forth virtual machine [cmu.edu] migrate into silicon (ala Novix [bournemouth.ac.uk]) so it could evolve from mere graphics rendering into a distributed Smalltalk VM environment (ala Squeak [squeak.org]) as videotex terminal/personal computer capacities increased. But I was _not_ interested in object-oriented programming as the long-term semantics of distributed programming environments. (I still have some of the hardcopy of the communiques with Xerox PARC and others from this period.)

    Rather, relational semantics were what I saw as the ultimate direction for distributed programming. I had a bit of a go at Tony Hoare [ox.ac.uk]'s "communicating sequential processes [bton.ac.uk]" paradigm and its Transputer [loyola.edu] realization because he was, at least, starting with the hard problem of parallelism rather than making like the drunk looking for his keys under the light post the way everyone else seemed to be doing (and still are, save for Mozart [mozart-oz.org], since threads, etc. are always an afterthought). But, because there were other hard problems like abstraction, transactions and persistence that he ignored, I christened his approach "Occam's Chainsaw Massacre [davis.ca.us]" in my communiques (in honor of his distributed programming language "Occam [bton.ac.uk]") and dropped it in favor of relational programming, which has inherent parallelism resulting from both dependency and indeterminacy. (BTW: Dr. Hoare seems to have finally come to his senses [ox.ac.uk] about this issue.)

    Unfortunately, the only researcher doing hardcore work on relational programming (meaning, getting to the root of relational semantics in a way that Codd had failed to do) at the time was Bruce MacLennan [utk.edu], then, of The Naval Postgraduate School [navy.mil], and he just didn't have the glamour of Alan Kay [sheridanc.on.ca] at places like Xerox PARC to attract the attention of guys like Steve Jobs [apple.com]. Bruce had a bit of a blind-spot, too, when it came to transactions and persistence, which I attempted to remedy by bringing David P. Reed [reed.com]'s work on distributed transactions for the ARPAnet to him, but although he wrote a white paper on a predicate calculus (close to a relational) implementation of Reed's thesis (MIT/LCS/TR-205), he didn't really "get it", IMHO. Reed and MacLennan abandoned their work for other pursuits (ironically, Reed was chief scientist at Lotus while Notes [lotus.com] was being developed but did not contribute his ideas on distributed synchronization to that development despite the fact that we had a mutual acquaintance from my Plato [thinkofit.com] days by the name of Ray Ozzie [cio.com] -- so, I share some of the blame for this failure) even as Steve Jobs botched the embryonic object oriented world by abandoning Smalltalk and giving us, instead, a lineage consisting of Object Pascal on the Lisa/Mac [mactech.com] which begat Objective C on Jobs's NeXT [bton.ac.uk] which begat Java at Sun via Naughton and Gosling's experience with NeXT. [umd.edu]

    This brings us to the present -- a world in which Javascript [netscape.com]-based technologies like Tibet [technicalpursuit.com] promise to not only salvage the object oriented aspect of the Internet from the birth defects of Jobs's spawn, but actually provide an advance over Smalltalk in the same lineage as CLOS [neu.edu] and Self [sunlabs.com]. But it is also a world in which there is growing confusion over the proper role of "metadata [w3.org]" in the form of XML [w3.org] -- particularly when it comes to speech acts [ohiou.edu] and distributed inference [w3.org]. I would call Tibet "the next major Internet advance" except for the fact that the basic idea for a Tibet-like system has been around and well understood since the early 1980's. When it is finally released, Tibet (or a system like it) will put the Internet back on track. I call that a "recovery", not an "advance".

    We are now poised to move forward with type inference [sun.com] based on full blown inference engines, thereby dispensing with the nonterminating arguments over statically vs dynamically typed languages that allowed Steve Jobs's spawn to get its nose in the tent. If you want to declare a "type" in a declarative language, just make another declaration and let the inference engine figure out what it can do with that information prior to run time. See how easy that was? Well, there is more to it than that, but not that much: Assertions have implications and assertions made prior to run time have implications prior to run time. Live with it and don't repeat the mistakes of the past.
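    The "just make another declaration and let the inference engine figure it out" idea above can be sketched as a tiny forward-chaining loop over assertions. The relation names and the single closure rule here are illustrative inventions, not any particular engine:

```python
# Hedged sketch: type declarations as ordinary assertions, closed
# over by a toy forward-chaining engine "prior to run time".
facts = {
    ("x", "isa", "Integer"),
    ("Integer", "subtype", "Number"),
    ("Number", "subtype", "Quantity"),
}

def close(facts):
    """Forward-chain the rule: if A isa/subtype B and B subtype C, then A isa C."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        derived = {
            (a, "isa", c)
            for (a, rel1, b) in facts if rel1 in ("isa", "subtype")
            for (b2, rel2, c) in facts if rel2 == "subtype" and b2 == b
        }
        if not derived <= facts:  # anything genuinely new?
            facts |= derived
            changed = True
    return facts

inferred = close(facts)
print(sorted(c for (a, r, c) in inferred if a == "x"))
# ['Integer', 'Number', 'Quantity']
```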

    The confusion over semantic webs, and the reason Berners-Lee et al. will fail, is essentially the same as the confusion that has beleaguered all inferential systems such as logic programming and "artificial intelligence" over the years: logic is irrational and the real world demands rationality -- otherwise nothing makes sense. By "rationality" I mean that reasoning must literally incorporate "ratios" -- or, as John McCarthy would put it, doing arithmetic so things make sense. By making sense, I mean there is a sense in which one interprets the sea of assertions that clearly dominates for a particular purpose. With logic, not only are you limited to 0 and 1 as effective quantities; you have no adequate theoretic basis from which to derive more accurate quantities with which to make sense by taking ratios and determining which inferences are dominant.

    Fuzzy logic and expert systems incorporating probabilities have typically failed because they are not based in the first principles of probability and statistics. As Gauss, the premier probability theorist, put it, "Mathematics is the study of relations." He didn't say, "Mathematics is the study of multisets." There are good reasons that relational databases, and not set manipulation languages, have come to dominate business applications -- and Gauss was aware of these differences when he began to derive his laws of probability. Subsequent axiomatizations of mathematics based on set theory were similarly misguided and have led to the idea that "fuzzy sets" are the way to introduce rationality into programming. Rather than sets, relations are the foundation, not just of mathematics but of rationality in the same sense that Gauss realized when he derived his theory of probability from the study of relations.

    Rationality allows for judgment which is recognized as inherently fallible -- but which allows one to proceed without exponentiating all possible paths of inference. Judgment also allows various identities to limit sharing of information to that needed -- thereby creating speech acts and a basis for rational measures of credibility associated with those identities. Since credit-rating is a degeneration of credibility, it should come as no shock that the invention of negative numbers, originating as they did with the Arabic invention of double entry account keeping, has its analog in something that might be called "logical debt" with which negative probabilities are associated.

    And now we have come to the "quantum" aspect of rational programming. It is precisely the "credibility debt" aspect of rational programming that corresponds, in mathematical detail, to the various equations of quantum mechanics and their negative probability amplitudes. (Von Neumann's quantum logic [hps.elte.hu] failed to properly incorporate logical debt, which has led to much confusion.) Logical debt is important to distributed programming for the same reason debt is important to financial networks. Logical debt is a way of handling poor synchronization of information flow in the same way that financial debt is a way of handling poor synchronization of cash flow. As in any rational system, there are both limits to credit and limits to credibility that influence one's judgments and actions, including speech acts.

    The object oriented folks may, in a sense, have the last laugh here because when we divide up inference into identities that engage in speech acts, we are reintroducing the notion of objects that hide information via exchange of speech act messages that can be thought of as "setters" (assertions) and "getters" (queries). However, I believe it is only fair to recognize that the excellent intuitions of Ole-Johan Dahl and Kristen Nygaard [pair.com] did need the added insights and rigor of philosophers like J. L. Austin and T. Etter.

  • That architecture will be used globally.

    They fucked up the first time with HTML.

    I'm not speaking about the work the software will do. I'm speaking about how it is likely to evolve.
  • It has an architecture for communication but not for actually linking conditions to automated actions.
  • How much do you want to bet that we'll start out with a useful system and then all of a sudden devices will blow up as different devices try to lower the volume on other manufacturers' commercials?

    Look at what happened to HTML.

    I don't even want to think what the equivalent BLINK or MARQUEE tag will be in this case.

    I think Strings.com has a better chance at this because they're contractors, not a software company, and they build tools as the customer requests.

    For any of these pipe dreams to work we need to get market share away from the generic producers and into the hands of those who work closely with customers.
  • Hey, as long as this crap on the Internet is finally meaningful to *something*...
  • by MattGWU (86623) on Wednesday April 11, 2001 @12:15AM (#299631)
    >>A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities

    Good ideas, but I think we first need to make Web content that is meaningful to Humans before we start worrying about our Computers.

    (Yeah, I know...not *that* kind of meaningful, but it had to be said, what with all the worthless drivel on the Internet and all)
  • Isn't this what MS are planning with .NET / Hailstorm?
  • What .NET is about.

    If you had access to some of the .NET "bluesky" videos presented at the PDC when .NET and VS.NET was more or less unveiled, you might think that this interview is practically a narration of one of the many examples of the "scenarios with a .NET connected world".

    XML works for this precisely because it is so semantically laden (bloated :). The glue to say "I'll explain my data if you explain yours" is one way to let you build "agents" easily.

    Utilizing .NET is one way to realize scenarios like this. There are others, if you don't happen to like MS :) However, chances are, .NET will be involved at some point. I would envision something along the following: It looks like SOAP/XML is set to be the shoulders that this stuff builds on. XML-RPC might be a relevant competitor, but chances are, SOAP vs XML-RPC will end up like the root "." and AlterNIC, i.e. there's the standard and there's the alternative, and they sort of interoperate, but all the big money will be on only one of them.

    FWIW, I think "the web" was supposed to be more "user friendly" from the start: UAs wouldn't ever display URLs to the user; the user would use key words or "What's Related" to navigate, and the URLs would be strictly a protocol thing, just like no one in their right mind reads their email by manually constructing IMAP URLs (imap://blah.com/blasdf/23erxd/1234).

    Getting computers talking to each other openly and intelligently is an obvious next step to get to the utopia of computers that actually help you manage your life as opposed to help you ruin it :)
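    Since the thread keeps coming back to SOAP, here is a hedged sketch of what a SOAP 1.1 message actually is underneath: hand-rolled XML using the standard envelope namespace, parsed back with nothing but the Python standard library. The GetAppointment operation and its urn:example namespace are invented for illustration:

```python
# Hedged sketch: a minimal hand-rolled SOAP 1.1 envelope for a
# fictional "GetAppointment" call, round-tripped through the
# standard library to show SOAP is structured XML over the wire.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

envelope = f"""\
<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <GetAppointment xmlns="urn:example:clinic">
      <patient>mom</patient>
    </GetAppointment>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(envelope)
body = root.find(f"{{{SOAP_NS}}}Body")
call = body.find("{urn:example:clinic}GetAppointment")
print(call.find("{urn:example:clinic}patient").text)  # mom
```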
  • Ooh yes, H4XX the 0nt010g135 D00D ! All your semantic concept are belong to us.

    Nice idea 8-)

  • So maybe in the future, those agents will be everywhere, but first the "impedance mismatch" between the real world and its representation in the virtual world must be reduced..

    "Impedance Mismatch" is just what the SW is all about. SW tools are antenna tuners. They can't stop things having different impedances, but they know how to adjust the baluns to allow communication.

    If you want an easier life, and your modest needs are typical, then just go down to Semantic Electrode Hut and buy 50 BNC connectors off the shelf.

  • How about Solresol [polarnet.com] (Except that it sounds like a Hawkwind album) as an Interlingua [interlingua.com] ?

    Last week I heard an interesting talk on ontological transforms (I work in a Semantic Web research lab), comparing the views of the Platonic Interlingua approach and a neo-Wittgensteinian transform-based approach, presented by a chap whose PhD, back in the days before the AI Winter, had been on machine translation of languages.

    Put simply, the Platonists assume the existence of a "Language of Heaven" (probably Welsh), and if you transform the source language into that, then there's another transformation into the destination language. All possible translations for n languages can be done with a mere 2n pre-existing transforms.

    The transform-based approach says there is no single common conceptual root. You can regress by transformations (maybe transforming Southern Redneck into English, then translating to French and thus transforming to Creole), but there's always a point at which you must translate between ultimate distinct roots.
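    The transform-counting claim above ("a mere 2n pre-existing transforms" via the interlingua, versus direct pairwise translators) is easy to check with a little arithmetic:

```python
# Hedged arithmetic for the Platonist claim: n languages need 2n
# transforms through an interlingua hub (each language to and from
# the hub), versus n*(n-1) direct ordered source->destination pairs.
def interlingua_transforms(n):
    return 2 * n

def pairwise_transforms(n):
    return n * (n - 1)  # ordered pairs: source -> destination

for n in (3, 10, 100):
    print(n, interlingua_transforms(n), pairwise_transforms(n))
# 3 6 6
# 10 20 90
# 100 200 9900
```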

  • Please print the article out on a 9-pin dot-matrix printer and try reading it again. This isn't an article about presentation, despite SciAm's unusually horrid typography. I don't know why they did that, but it's nothing to do with the SW.

  • The internet has drifted away from being an information source

    No it hasn't, it's just hiding behind the flashing lights of the high profile sites and the banner ads.

    Even Las Vegas has a library somewhere.

  • No, it isn't.

    DTDs are dead - please use XML Schema instead, you'll have a much happier time.

    Secondly, RDF is a data model, not a serialisation against a DTD. It's about how to share a toolset for working with distributed graphs of data, not about a single file format. If you want to really grok RDF, think about its data model, not its XML serialisation.

    .NET isn't the Semantic Web. It's the same difference as watching the Discovery Channel, compared to researching in a good library. M$oft aren't trying to share the same (somewhat) P2P principles as the SW, they're just seeing it as a way to sell server and terminal based services. Home computers are dead, and they know they need to move their business into the future appliance and service-based world. The SW isn't about this, it's about opening things up, not re-selling pre-packaged pap on a monthly subscription agreement.

    M$oft are also a long way behind on their understanding and their involvement with the researcher / developer communities for this work.
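    The "RDF is a data model, not a file format" point made above can be sketched without any serialisation at all: a graph is just a set of subject-predicate-object triples, queried by pattern matching. The URIs and predicate names below are illustrative only:

```python
# Hedged sketch: an RDF-style graph as bare triples, queried
# independently of whatever serialisation it arrived in.
graph = {
    ("http://example.org/doc", "dc:creator", "Tim"),
    ("http://example.org/doc", "dc:title", "Weaving the Web"),
    ("http://example.org/tim", "foaf:name", "Tim"),
}

def query(graph, s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

print(query(graph, p="dc:creator"))
# [('http://example.org/doc', 'dc:creator', 'Tim')]
```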

  • It has an architecture. What are you blathering about ?
  • Why is it that everyone these days is ranting about how XML will change the world, when SGML, despite being around for god-knows-how-long, has quietly done its job yet not attracted any of the hoopla its XML cousin has?

    Because SGML blew it. The SGML experts wanted to stay as experts, so they developed a priesthood cult around it. Go and talk to them in comp.infosystems.w.a.h - they still have their heads firmly up their backsides. XML threw a simple toolset at everyone, and followed the bits that seemed to stick, dumping those that didn't. Slashdoterati should love this - it's Open Source triumphing over suitware.

    XML doesn't specify anything useful, as opposed to HTML which (mostly) specifies the meaning of its tags.

    You need a better clue.

  • No. Bigstyle.

    Pay attention at the back.

  • Originally, HTML was supposed to be about content,

    HTML has never been about "content". At one point it was supposed to be about "content of web-like infosystem pages", but it never quite happened (thanks to <BLINK> et al.). If HTML really was "just about content", then it would have looked a lot more like DocBook.

    Even that's not enough. You can't have a generic content language unless you have a structural schema to define what today's content needs to be describing. You can do this in SGML, or you can do it in a simplified manner in XML, but you can't do it with the fixed set of tags for any dialect of HTML.

  • TBL should be listened to for many reasons:

    • He's smart. Smart isn't a consumable, so he still has plenty left, even after the web.
    • People rarely invent one good idea. Most invent zero. Of those that invent one, they're generally good for a few more too.
    • It's an inherently good idea. Lots of equally smart people want to do it too.
    • Being the famous smart guru behind the web meant that he could give up his day job in McD's (or wherever it was that he worked) and spend the surplus time just thinking about Clever Stuff. He probably does a lot fewer project status review meetings than the rest of us.
    • Being smart is fun. Having other people realise this is even more fun. He can walk into any research lab in the world, and a queue of their smartest people will form, just to talk to him. This encourages invention.
    • He smells better than some other gurus I could mention 8-)
  • Funny thing is; the more he thinks, the luckier he gets...

  • We don't need standards for defining the information (that's EDI, or even XML), what we need instead are standards for how the information will be defined. It's much more useful (in a broad sense) to have a language for expressing semantics and ontologies than it is to have a published standard for "invoices" and "patient records".

    Take a look at DAML [daml.org] for more.

  • What .NET is about.

    Did you read the article ?

    Do you know anything about the SW, and M$oft's strategies ?

    M$oft are nowhere in the SW initiatives. They are taking a stand that is almost completely opposed to it. Their new Hailstorm [microsoft.com] strategy is centralist and schema-based. Rather than build a Semantic Web where anything can talk to anything, M$oft are trying the BizTalk approach; where they license your own rigid schema back to you, so that you can pay to access a centralised server and receive data in their prescribed formats. Stalin would have been envious.

    .NET has no relation to the Semantic Web. M$oft don't understand the first thing about it.

  • by dingbat_hp (98241) on Wednesday April 11, 2001 @04:14AM (#299648) Homepage

    Yes, it is RDF. There are many areas of the SW work where it's not clear what the final technology will be (notably the schema expression tools, such as RDF Schema vs. OIL or DAML or DAML+OIL), but RDF itself seems almost certain to be used - there's just nothing else offering itself as a competitor in that niche.

    Some clarifications: XML isn't RDF, and RDF isn't XML. RDF is fundamentally a data model, whereas XML is just a serialisation of a much simpler infoset model. As RDF doesn't have its own serialisation (how you write it down), then the convention has been that it's done in XML. You could serialise RDF into anything you like, but I've yet to see a non-XML one.

    XML Schema isn't the same as most other schema languages in this field. XML Schema is concerned about structure and operational matters, not about expressing semantics. XML Schema would be a very bad choice for expressing the semantics of the SW. It works OK for Ariba and XrML, because they're quite limited domains of discourse (an invoice is an invoice is an invoice). Even with MPEG-7, XML Schema has run out of steam and the MPEG group have had to invent their own schema expression language. Using XML Schema for bureaux like BizTalk is extremely limiting, and a bad move long-term.

    DTDs are dead. Use XML Schema instead.

    RSS (the site-summary format used for Moreover [moreover.com] newsfeeds and to make the Slashboxen work) isn't RDF. It's expressed in RDF and defined in RDF Schema, but it's just one RDF application out of many.


  • .NET != SOAP
    .NET != XML

    XML is an open standard, many people used it before M$, and many people will continue to use it without going anywhere near .NET

    SOAP is an open standard, many people used it before M$, and many people will continue to use it without going anywhere near .NET

    When will people stop seeing anything vaguely to do with interoperability, agents, data discovery etc and shouting "hey - .NET does that!! mod me up big boy!".

    XML-RPC is not a competitor to SOAP, they are for different applications, and a single system could easily use both.

    Flame away (asbestos suit is on...).
  • Coming from the guy whose work spawned the WWW, this is some speculation worth taking seriously

    Coming from the guy who hopes to make money out of YACSSL (Yet Another Client Side Scripting Language), one might begin to think that the WWW was a fluke.

    Rich

  • Nice post; it should cause a few headaches for the mods though, as it wouldn't be quite so (+1, funny) if it weren't so (+1, insightful)...

    The article has some interesting points to make, but it does look as if someone has played buzzword bingo with a highlighter pen on it.
  • aaah. Thanks for that; it makes a bit more sense now. I just had visions of Dr. Evil making quote marks with his fingers on the orange words the first time.
  • This new tech sounds fun and all, but I'm not sure I want someone to be able to download all my medical information because the palm doing the requesting claims to be owned by a relative...

    And can you imagine the fun to be had with an HTML tag that alters the volume of people's stereos? Let's just hope that as well as "shut everything up and listen to my advert" there is a tag for "wake up the sucker's neighbours!"
  • So he created the world wide web, big deal... I wrote a currency conversion program in BASIC one time.
  • ...I can't see the difference really. I know I'll get slammed for even mentioning M$, but the example scenario given at the beginning of the article is virtually identical to the one that M$ showed at their Professional Developer's Conference last year when they unveiled .NET
  • by table and chair (168765) on Wednesday April 11, 2001 @01:00AM (#299657)
    In the FUTURE random words in BLOCKS of text displayed ON THE web WILL be inexplicably highlighted IN A stylish PINK-ORANGE several point SIZES LARGER THAN the rest of the body text. This will come to be known as bernersing, and will BECOME a standard control in GUI web-design APPS, WITH options for frequency, DENSITY , and with the advent of the Semantic Web, relevance TO content (default for the latter = 0).

    THOUGH this destroys the FLOW of the TEXT by wrenching the READER'S eye about and causing IT to pause, rather than travel naturally FROM WORD to word, this typographic treatment WILL BE hailed as a BREAKTHROUGH in internet desig N and will unleash a revolution OF NEW possibilities.

  • If you don't know who Tim Berners-Lee is, go ask CowboyNeal.

    Did you ever notice how you never see Tim Berners-Lee and CowboyNeal in the same room together???

    Pretty freaky stuff...


    --
    "Chiswick! Fresh horses!"

  • Just wait, these systems will be so complex underneath that in order for them all to be usable and compatible, we'll have to have a frontend like that little Office Paperclip. In order for it to be consumable by the masses, it will have to have somewhere that you can enter a natural language question, either typed or spoken. The Paperclip agent becomes more useful when it provides some sort of thing to speak to, simulating face-to-face interaction.

    Maybe that's what Linux needs...*ducking bricks*

  • This has no business being considered "web". This should be some protocol that enables caching at some front door of the site so that sites are not constantly overrun by info spiders crawling up and down them all day long to find this guy's phone number and hours of work and various other info. Definitely, there should be some standardized format for this, but it already sounds like something that is just a centralized database container for vcards.

    If things were better structured, this guy's info would be cached/replicated at some town/corporate web site in turn cached/replicated at some other geographical and/or corporate portal where you look things up as a directory. Kind of makes me think of LDAP.

    Anyway, this is the sort of thing that directory servers such as Bigfoot could provide with today's technology if only they could structure it.

  • The first thing that popped into my mind, using the current web in a way in which computers can talk to each other to get relevant information, is RDF (the file format most commonly used to get the headlines from news sites).

    Although there's nothing revolutionary about using other sites' headlines, I can just start to imagine what other ideas people will have that use existing structures to build a real (i.e., more useful) information empire.
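    As a rough sketch of how little machinery the RDF headline format needs, here is a made-up RSS 1.0 (RDF-serialised) feed pulled apart with only the Python standard library; the feed contents are invented:

```python
# Hedged sketch: extracting headlines from an RSS 1.0 (RDF) feed
# with the standard library alone. The feed itself is fictional.
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"

feed = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.org/1">
    <title>Berners-Lee On The Semantic Web</title>
  </item>
  <item rdf:about="http://example.org/2">
    <title>Ask CowboyNeal Anything</title>
  </item>
</rdf:RDF>"""

root = ET.fromstring(feed)
headlines = [
    item.find(f"{{{RSS}}}title").text
    for item in root.findall(f"{{{RSS}}}item")
]
print(headlines)
```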

  • If you had access to some of the .NET "bluesky" videos presented at the PDC when .NET and VS.NET was more or less unveiled, you might think that this interview is practically a narration of one of the many examples of the "scenarios with a .NET connected world".

    Ok, so s/\.NET/web\ services/g. I saw one of those 'blue-sky' videos and was sufficiently inspired to start working with SOAP using Java and Apache SOAP [apache.org] (I didn't have a .NET SDK available :( ).

    Web services are much better than a cluster of HTTP POSTs for the same reason that Java servlets are better than CGI scripts (at least for large scale development). Just because MS is hyping web services in .NET doesn't mean web services are bad.


    ____________________________
    2*b || !(2*b) is a tautology



  • Actually, if you don't know him, it's because you've not read to the bottom of the page, where it actually reads: "Berners-Lee is director of the World Wide Web Consortium (W3C)"
    --
  • Originally, HTML was supposed to be about content, and information was supposed to be organized structurally. Then people lost sight of this in the competition to make the coolest-looking web page, meaning lots of flashy stuff, applets that only run on one browser, and overbearing page layout control that completely messes up your page if your eyesight or monitor are not as perfect as the web designer's.

    The stock response to this observation, by those who love flashy stuff, has been, "well, that's how the web evolved; get over it." But now it seems if we want these intelligent agents, people will have to start making their web pages with content in a somewhat standard form again, for AI can only go so far in figuring out random content.

  • This brings us to the present -- a world in which Javascript-based technologies like Tibet promise to not only salvage the object oriented aspect of the Internet from the birth defects of Jobs's spawn, but actually provide an advance over Smalltalk in the same lineage as CLOS and Self. But it is also a world in which there is growing confusion over the proper role of "metadata" in the form of XML -- particularly when it comes to speech acts and distributed inference.

    Where is this confusion you refer to in the role of metadata? From my basic knowledge of XML and its place in life, it seems like XML is quite clear about data, metadata and their relatives. OTOH, from my brief encounter with the CLOS examples you linked to, it seems like CLOS does mix representation, data and actions on the data (as a programming language would do), something XML does not do. After all, one of the design goals of XML was to separate the layers of metadata, data and presentation.
    I fail to see the relation that XML has with speech acts, or the confusion inherent in MetaLog. Actually, MetaLog seems like a pretty good idea and seems to complement the idea of the semantic web quite nicely.

    In general I think that whether the semantic web is the next big thing or not, it is still a big move towards a more data- and user-friendly web. Search engines have tried their best (despite what is claimed in the original article, most of the web is NOT indexed in SEs) but are not enough, both in their indexing and retrieval capabilities. If the semantic web does indeed solve, at least in part, the polysemy problem by giving agents (or indexing robots) some kind of context, it would be a great help.

    limbo.
  • I think it's not just 'open standards' that allow for this access; it's a framework in which multiple standards, perhaps slightly conflicting standards, can coexist.

    I think that regardless of your computer ontological religion, it is much more likely and feasible to have the sort of information exchange that TB-L describes, than to have an MS mega-farm that holds everything for everybody. These are the same reasons that client-server became popular after the days of coherent, controlled mainframe-terminal systems--distribution and localization of computing and other resources makes many tasks much easier.

  • This reminds me of a very similar discussion [kuro5hin.org] over at K5 [kuro5hin.org].
  • Dr. Doug Lenat's Cycorp [cyc.com] released a press release [cyc.com] a couple of days ago indicating that they were going to release a new version of their knowledge-base and inference engine called OpenCyc in July 2001. They say that this is to support Tim Berners-Lee's Semantic Web. They also say that they are going to join the W3C.

    I find it strange that this Scientific American article talks about the Semantic Web, AI, Agents and Ontologies and yet makes no mention whatsoever of the largest knowledge-base of commonsense information - Cyc.

    The Cyc knowledge base was begun by Dr. Lenat in 1984 at the government-funded consortium that was located in Texas to help U.S. companies compete with Japan. Lenat and his colleagues have spent the last 17 years creating a vast store of knowledge. I haven't really heard much of what has happened to Cycorp in the industry and have been expecting it to make it into products at some point, so I was really happy to see the press release on Cyc.com. And yet why didn't Tim at least give a passing mention to the ground-breaking research that Lenat has already done in this vein of AI?

  • The semantic web, as proposed, has both natural language and computer-oriented info for use by software agents (eg, <RELATION> tags). So natural language and the SW tags aren't in competition, and your criticism therefore doesn't apply.

    You are correct that what SW tech is trying to do isn't easy. Getting disparate companies and countries to cooperate in a world wide web must not have looked easy to him (Tim B-L) either.

  • The SciAm article says that URL is a subset of URI.

    Example: the "local devices" the article talks about (eg, the stereo that needs to be lowered) would be referenced via URI and not the URL we're used to, i imagine.

  • Coming from the guy whose work spawned the WWW, this is some speculation worth taking seriously

    I found this proposition to be quite empty. No disrespect to Berners-Lee, but just because he did design something which has ultimately had great impact does not mean he is a visionary with any special insights as to where the world or communications is going in the future. My understanding is that the beginnings of the WWW were quite mundane and so were his initial aspirations. He is no more credentialed to opine as an authority on the future of communications than (say) Shawn Fanning would be to speak on the future of music distribution. Just because he had the good fortune of coming up with something which ended up becoming revolutionary does not mean he is any more likely to do so again in the future.

  • Clay Shirky wrote an excellent article on why intelligent agents (like the ones Berners-Lee is describing) are a dumb idea, regardless of the underlying infrastructure. The article is over a year old but still just as relevant as ever:

    -M

    You're smart; why haven't you learned Python yet? diveintopython.org [diveintopython.org]

  • Not to mention ontologies that have gone 404.

    From the article:

    The meaning of terms or XML codes used on a Web page can be defined by pointers from the page to an ontology.

    Great. Back at TB-L's example, what actually happened is that Lucy's handheld browser was running MSSemanticAgent 2.0 while the doctor's office's web page was running MSSemantic 1.1 which pointed to an ontology that was no longer available on MS's server so the prescribed treatment tag came up undefined. So she set up a search for providers by hand. Unfortunately, some of the provider lists were running OpenSourceSemantic so MSSemanticAgent 2.0 refused to recognize their tags.

    Lucy scrolled through the list, looking for familiar names, and tried to run them against Mom's health plan's server to find out if any were in-plan but the health plan's server must have been down because all she got back was connection refused.

    By trial and error, she found a couple of plausible-looking providers. She wanted the agent to find providers with a rating of excellent or very good but the only compatible rating site seemed to be slash-dotted so she tried to check appointment times. Unfortunately, the first provider's web page hadn't been updated recently -- it was offering appointments for some time last fall. They probably put up the appointment page and forgot about it so they never noticed that it got disconnected from their appointment book when they upgraded their scheduling application.

    When she found a provider that she thought might be in-plan that had a plausible-looking appointment schedule, she thought she would at least find out if it was within a 20-mile radius. The mapping site couldn't find the address, so all it could say was that the center of town was within a 20-mile radius. She decided that she had wasted enough time already, so she tried to have her agent send the search to Pete's agent, having complete trust in Pete's agent. At least it would have had complete trust, except that one or the other of them was running with an expired certificate (although Pete had accepted a forged Microsoft certificate earlier in the day by routinely clicking on OK when presented with a warning message).

    After trying unsuccessfully for some minutes to figure out from the cryptic error message what was wrong, Pete gave up and re-entered the search himself. His agent told him the appointment would work without rescheduling any less important appointments. Unfortunately, that was because the last time he had sync'ed his PDA with his web calendar, it had quietly failed to copy some appointments. Or perhaps it had put up an error message which he had reflexively dismissed because there was a blizzard of pointless warning messages and pop-up ads every time he accessed his web calendar.

    And so on and so on. Pete and Lucy would have been better off just to make a couple of phone calls, talk to a couple of humans and get the whole thing done.
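    For what it's worth, the article's "pointer from the page to an ontology" boils down to an XML namespace URI, which is exactly why it can go 404. A minimal sketch with the Python stdlib; the med: namespace, its URL, and the clinic URL are all invented for illustration.

    ```python
    # The meaning of a term is "defined" by the namespace URI it lives in.
    # All names and URLs here are hypothetical.
    import xml.etree.ElementTree as ET

    doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:med="http://ontology.example.com/medical#">
      <rdf:Description rdf:about="http://clinic.example.com/treatment/42">
        <med:prescribedTreatment>physical therapy</med:prescribedTreatment>
      </rdf:Description>
    </rdf:RDF>"""

    root = ET.fromstring(doc)
    for elem in root.iter():
        # ElementTree folds the namespace into the tag: {uri}localname.
        # If ontology.example.com ever 404s, the tag still parses --
        # it's the *meaning* behind the URI that goes undefined.
        print(elem.tag)
    ```

    Nothing in the parser ever dereferences the ontology URI, which is the point of the parent's complaint: the page keeps "working" long after its meaning has rotted away.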
  • "hypertext" at that time was theory - we were all "using hypertext" in 1990?

    Those of us forward-thinking enough to be running Windows or MacOS instead of some bassackwards Unix clone were happily using Hypertext in Windows Help and HyperCard, thank you very much.
    ---

  • Actually, the newer .CHM files are written in HTML even (instead of RTF as the old .HLP ones were).
    ---
  • Although a semantic web engine probably wouldn't make some of those assumptions.

    And wouldn't get the movie reference either, I suppose. <wry grin>

  • This is pretty interesting. It's pretty hard to track down information about people, especially if they don't have much of a Net presence, or you simply don't have enough starting data. I'd really like to be able to turn up interesting data hidden several layers deep, and be able to search for things with a vague query that involves all sorts of peripheral identifiers. For people, it might be high school attended and current company, maybe other people they might know.

    Hmm. Definitely interesting. Although I doubt it will be able to find my father's brother's nephew's cousin's former room-mate. ;)

    Oh, for the Semantic Web - I saw a couple of stories here before, but here's something that might be informative:

    Semantic Web Roadmap [w3.org] (http://www.w3.org/DesignIssues/Semantic.html [w3.org]) - also by Tim Berners-Lee, last modified 1998/10/14.

  • Agent systems again?

    I'll agree that the technology and the need for such devices have grown since the days of Microsoft Bob (hint to all CEOs: never let your wife develop software). XML and SOAP-based services could definitely make this work the way it is supposed to (and so often does not).

    There are, however, a few major caveats to this system working. First, every device you want has to have some sort of networking capability. While I've seen a lot of press and smoke over such things, I have yet to go down to the Wiz and select from a broad variety of networked consumer electronics. There is also the prospect of rising prices for such interconnected goods, probably due to integrated wireless technology, unless everyone feels like rewiring their homes with new cable. You know, if Bluetooth actually, like, worked properly, it could fill this gap.

    Next, I hate to beat the old /. horse to death here, but what about security? We frequently feel violated when even broad aggregate data is collected about us. What are the implications of literally everything we do being sent over the wire? Read Gibson's "Idoru" to get a sense of worst-case situations.

  • Privacy concerns? Anyone? If this were presented by a large software company as a business proposal, I know for a fact we would all be furiously foaming at the mouth.
  • I WAS going to do exactly the same thing, but you beat me to it!
    An interesting article ruined by some artistic editing. WHY?!


  • by erikkemperman (252014) on Wednesday April 11, 2001 @03:16AM (#299681)

    So far I haven't read a post that addresses the other side of the matter: you might not even want to overcome the barrier between human- and machine-readable languages, at least not in some cases. I have some limited knowledge of work by the likes of Chomsky, as well as of supposedly "culturally neutral" and "unambiguous" languages such as Loglan/Lojban. I feel most people, techies leading the pack, tend to forget that, often, the meaning of language can be effectively tweaked, stressed, or even negated precisely because it's ambiguous or culturally predisposed. Think of all the problems, for instance, that would arise if you wanted to teach a machine the meaning of phrases like "Indian summer" or "Poetry in motion".

    In general, natural language is to me a wonderful "protocol" because it forces participants to make the effort of understanding each other's customs, ethics, interests and interhuman sensitivities. Moreover, the natural language that people speak in some region always reflects that region to some extent, in terms of politics, history or even climate. I'm Dutch, for instance, but can you understand what I mean by "How a cow catches a rabbit" (a literal translation of a Dutch phrase; guesses, anybody?)

    The gurus and tech developers should throw the de facto standard philosophy of "if it can be automated, automate!" out the window and face the fact that, whether you like it or not, natural language is in fact a very powerful semantic framework all by itself. It's "standardized" (vocabularies, dictionaries, etc.) and "backwards compatible" (languages mostly evolve quite organically); its practice is just not so readily automated.

    regards, EK

    --
  • If you found TBL's article of interest, be sure to check out this amazing interview [vqfoundation.org] he sat down for. Tim fielded several issues such as this concept of the Semantic Web, Open Source and the Web Wars. A must see.
  • Sure, this is going to take time; what doesn't? We have seen what a rush does, with all the .coms crashing. But with devices such as the Blackberry [blackberry.net] we are getting there. Before you ask: no, I'm not a user; I have a Visor Edge.
  • The article has some interesting points to make, but it does look as if someone has played buzzword bingo with a highlighter pen on it.
    The highlighted words weren't buzzwords; read it again. The highlighted words were the portions of the exchange that a computer could understand and act on. None of the actual article was highlighted in that fashion, but the illustrative story was highlighted differently to show which parts of speech were useful instructions to a computer system.
  • Very interesting, but I can't help wondering if they're trying to do "top down" programming while "bottoms up" drunk.

    Dis Semantics Web [semanticweb.org] page on inference engines is suffering syntax.

    There'll be a lot of "You know what I mean, right?", nose flicking, secret pig latin codes, lipograms, monkeys-at-a-typewriter, Voynich artwork, McLuhan collages, the Beatles' song "I Am The Walrus", or, on the darker side, like Frankenstein's monster [foxnews.com]: something that lived without first consulting planned parenthood.
    Memetic epidemics.

    The RDF stuff is interesting.
    Imagine though the problems with semantic grammars.
    In Britain and many of its colonies, they drive on the left and sit on the right. In Japan, Turkey, etc., they use Lukasiewicz's RPN form, Subject-Object-Verb (SOV), which can be very tricky for "someone who drives on the right side of the road" to get used to. A lot of semantic accidents will happen.

    Driving with diplomatic immunity.

    Just give me a passport,a flu shot
    and point me to the duty-free shop.

  • Rational programming/fuzzy logic/quantum software, blah blah. It's all still rule-based programming created in advance of its use by programmers. IMO the only real way forward for machine intelligence is neural networks, which can adapt their "programming" to current conditions and experiences. Yes, large-scale networks are DAMN hard to set up properly and there are numerous huge obstacles to overcome (such as how the neurons in the brain "know" what they should learn when presented with a particular stimulus), but ultimately they are the only way to imbue machines with real intelligence rather than just a very fast version of John Searle's Chinese Room.
  • by Salieri (308060) on Wednesday April 11, 2001 @01:10AM (#299689)
    A separate, commercial common-sense project is called CYC. It's been going since 1984 but is just starting to scratch the surface of the encyclopedia. It's incredible how dependencies can get you: for instance, you can't program what an aardvark is without also going into what it means to be a mammal, the geography of Africa, basic anatomical pieces, and behavioral traits -- all of which have their own dependencies, and so on. (Don't quote me on that though; "dependencies" probably isn't the right word.)

    Here [cyc.com] is the link to CYC, an interesting read about knowledge representation. It's also pretty timely, since they are about [cyc.com] to release some of the project after 17 years of development. Might make a good story.

    By the way, how many people posting now are from Australia? My sympathies for any other -500 students whose homework also kept 'em up tonight.

    --------------------------------
  • by Peridriga (308995) on Wednesday April 11, 2001 @12:25AM (#299690)
    I will be the first to say it.... I love technology... But, reading this makes me wonder what is enough...

    Alas, voice-activated and personalized networks are going to aid everyday life (especially for the physically handicapped), but they remove the most developed and complex form of communication... human interaction..

    This is becoming less and less a factor in the average human's life.. With business going paperless and friends going wireless, when does someone really have to talk to someone?... If you telecommute and email your family, do you really have to talk to anyone, besides maybe your coffee maker when you get up in the morning..

    I don't want to be an anti-technology advocate but merely to express an idea that we are excluding the most needed facet of human life... interaction...

    Prisoners are isolated for punishment... We are isolating ourselves for convenience?..

    Well... my two cents.. y'all can make as much change out of it as possible...

    --- My Karma is bigger than your...
    ------ This sentence no verb
  • I think some lateral thinking should be used here. Perhaps a different representation of the "Web", rather than trying to develop solutions for the current "Web".
  • by Flying Headless Goku (411378) on Wednesday April 11, 2001 @12:26AM (#299694) Homepage
    The human-readable and computer-readable stuff, that is.

    How? Lojban [lojban.org], a constructed language designed to be absolutely consistent and logical. You might know it in its earlier incarnation of Loglan, which was mentioned in passing as a language used for conversing with computers in Heinlein's "The Moon is a Harsh Mistress."

    Certainly, you could structure a valid Lojban statement to be unreadable to computers, but it isn't that way by default. If you state things directly, the computer can extract useful information.

    This is why I'm absolutely 100% certain that we'll all learn Lojban soon. Yup, there is no doubt in my mind. None at all...
    [rolls eyes,whistles a little tune]
    --
  • The entertainment system was belting out "Put 'Em on the Glass" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His mistress, Lucy, was on the line from the office: "I think we need to see a specialist and then have a series of physical sessions. Bi or something. I'm going to have my agent set up the appointments." Pete immediately agreed to pay the fees, after confirming that she meant a chick.

    At her "advisor"'s office, Lucy instructed her Semantic Web agent through her vibrowser. The agent promptly retrieved information about the "treatment" from her advisor's agent, looked up several lists of providers, and checked for the ones within budget and a 20-mile radius of her home and with a rating of triple-H (Hot, Horny, and Healthy) on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.

    In a few minutes the agent presented them with a plan. Pete didn't like it. The university student housing was all the way across town from Lucy's place, and he'd be driving back in the middle of rush hour. He set his own agent to redo the search with stricter preferences about location and time. Lucy's agent, having complete trust in Pete's agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already sorted through.

    Almost instantly the new plan was presented: a much closer brothel and earlier times--but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were--not a problem. The other was something about his STD checker's list failing to include this provider: "Non-contagiousness securely verified by other means," the agent reassured him. "(Details?)"

    Lucy registered her assent at about the same moment Pete was muttering, "Spare me the details," and it was all set. (Of course, Pete couldn't resist the kinky details and later that night had his agent explain how it had found that provider even though it wasn't on the proper list.)
    --
