The Internet

Tim Berners-Lee's List

weink writes "Tim Berners-Lee has made a career out of resolving Internet pet peeves. Ten years after he invented the Web and made the Internet user friendly, he is still drafting lists of things that could work better."
Comments Filter:
  • TBL applied hypertext to the internet and invented the Web at CERN. He did actually invent it. He didn't invent the internet, that was DoD.
  • If babelfish can read a page and map it to another language, it's reasonable enough to expect that the relevant topics in a page can be extracted and mapped to some kind of categorization.

    Babelfish just has to look up words and phrases in a dictionary and replace them with their defined equivalents -- that's why it's not as good as a human translator.

    Picking up the relevant topics in a page is a good deal harder, since it seems to require some degree of comprehension. I can't imagine a bot being able to distinguish a page about how much Bill Gates sucks from a page about how much pages about how Bill Gates sucks suck -- or even easier stuff.

    The mapping is easy; the extraction is hard.

  • by AcMe ( 715 )
    Err... Web, even. Spelling like crap will get me nowhere...
  • I really get annoyed with the emphasis on e-commerce that Berners-Lee and his ilk have had. Everything I see coming out of the "web innovation" stuff is new ways to get people to spend money. Security for e-commerce, IDs for e-commerce, searching for e-commerce... how terribly dull.

    It makes it hard to argue against the anti-Internet/anti-technology people when so many people arguing for the Internet see it as nothing more than a way to sell products. People who read about the Internet from the mass media have every reason to be scared of it -- it sounds horrible from that perspective.

  • Gosh, I almost had a seizure with all those flashing, blinky things! Talk about a Las Vegas-style webpage.
  • Sorry for the confusion. Just to clear things up: I've been around a while and remember gopher and UUCP. I didn't think the internet was the web; I just got a little confused about what exactly Al Gore invented.

    Again, I apologize for any confusion I may have caused.

  • You misspelled "Al Gore."
  • The chief trouble with trusting metadata is that page owners and maintainers who are paid by advertisers for eyeballs will have every reason to label their pages with false metadata in order to attract clicks.

    Consider the problem that search engines faced when they indexed solely on the basis of textual relevance: pr0n sites filled their pages with the same words, repeated over and over again: "teen sex xxx porn pictures teen lesbian sex erotic sex xxx porn porn xxx sex teen girl babe sex sex xxx" and so forth. This made their pages more likely to turn up at the top of a search, and thus garnered more eyeballs for their advertisers. Who suffered? Teen-age lesbians (etc.) looking for informative sites about issues related to their lives, not for hetero-oriented pr0n.

    Metadata systems are just as exploitable. Anyone familiar with the Prisoner's Dilemma will recognize the following --- because these systems (like pure textual relevance search systems) reward "defecting" behaviors such as deliberately false labeling, they will not solve the problems that result therefrom.

    Even notwithstanding the problem of dishonest behavior, there remains the problem of clueless or simply self-aggrandizing behavior: users labeling their pages as more relevant to a given topic than they really are, or not understanding distinctions among topics. A marketer at Dell might not know what "computer science" is, and insist that "computer science" be added to the metadata of Dell's e-commerce site. "After all, we sell very scientifically-designed computers. Isn't that what computer science means?" Cluelessness reigns supreme.

    Until these problems can be solved, human-indexed sites like yahoo.com and dmoz.org will have some huge advantages over spider-powered search engines.
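
    To make the "pure textual relevance" failure mode concrete, here is a minimal sketch in Python: naive term-frequency scoring stands in for whatever the real engines did, and the page texts and query are invented. Repetition alone wins the ranking.

    # Minimal sketch: naive term-frequency ranking rewards keyword stuffing.
    # The page texts and query are made up; no real engine's algorithm is implied.
    from collections import Counter

    def score(page_text, query_terms):
        """Score a page by how often it repeats the query terms."""
        words = Counter(page_text.lower().split())
        return sum(words[term] for term in query_terms)

    pages = {
        "honest-resource": "support and health resources for lesbian and gay teen readers",
        "stuffed-ad-farm": "teen sex xxx porn lesbian " * 40 + "click our banner ads",
    }

    query = ["teen", "lesbian"]
    ranked = sorted(pages, key=lambda name: score(pages[name], query), reverse=True)
    print(ranked)  # the stuffed page floats to the top on raw repetition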
  • I was kinda disappointed by the lack of detail in that story.

    The topic of P3P alone should be enough to start whole flamewars on privacy issues.

    Then again, the ICE seems more like a Pointy-Haired Expo anyway.

    "We mean eBusiness" - why does that remind me of Dilbert? ;)
  • I'm currently looking into the alphabet soup of standards coming out of the W3C, trying to decide which ones are useful and how they might be applied to free software and Gnome in particular.

    There are a lot of interesting things out there. In particular, I think XML and DOM could be the basis for a very good component framework in which powerful components would be easy to write and would integrate nicely without a lot of hassle. I'm looking at RDF as a piece of this.

    But, as far as I can tell, the problem that RDF solves is a bit different than the one mentioned in this article. RDF is a way of representing documents as graph structures, allowing individual files to contain both local and external pieces without everything getting tangled up.

    The problem of representing metadata unambiguously is a tricky one, and it is not yet solved. The RDF spec presents an interesting outline of how this might be done, but it doesn't quite tell me what I need to do to get my own Web pages correctly meta'ed. If I were a library, then the Dublin Core [purl.org] would start to give me the specific markup I needed, but that's just for libraries. What do I use as metadata for my free software efforts?

    It seems like the combination of XML, XML namespaces, Dublin Core, and all the other Dublin-Core-like recommendations, specifications, and standards for domains other than libraries might cohere into a workable metadata system for the Web; on the other hand, the complexity and fuzziness of the specifications could very easily prevent the beast from flying.

    When you're dealing with software, precise specification is key. Some metadata standards have succeeded pretty well in this regard - take MIME content types, for example. If you have a JPEG image, you know that the content type should be "image/jpeg". But the XML crew hasn't even managed a consistent namespace name for HTML 4.0 (I've seen "urn:w3-org-ns:HTML", "http://www.w3.org/TR/REC-html40" and others).

    For those hoping for a more technical discussion of RDF, I recommend the Mozilla page on RDF [mozilla.org] and of course the specification itself [w3.org].
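
    For concreteness, here is a rough sketch (in Python, using only the standard library) of the sort of Dublin-Core-in-RDF/XML description I'd like to attach to a project page. The RDF and DC namespace URIs are the real ones; the project URL, title, and author are invented.

    # Rough sketch: Dublin Core properties in RDF/XML for a hypothetical
    # free-software project page. Everything describing the project is made up.
    import xml.etree.ElementTree as ET

    RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    DC = "http://purl.org/dc/elements/1.1/"
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)

    root = ET.Element("{%s}RDF" % RDF)
    desc = ET.SubElement(root, "{%s}Description" % RDF,
                         {"{%s}about" % RDF: "http://example.org/my-gnome-component/"})
    ET.SubElement(desc, "{%s}title" % DC).text = "My GNOME Component"
    ET.SubElement(desc, "{%s}creator" % DC).text = "Jane Hacker"
    ET.SubElement(desc, "{%s}subject" % DC).text = "free software; components; XML; DOM"
    ET.SubElement(desc, "{%s}description" % DC).text = (
        "An experimental component framework built on XML and DOM.")

    print(ET.tostring(root, encoding="unicode"))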

  • Uh, I think RDF is XML.

    It is supported in Mozilla, but not Internet Explorer 5.
  • I think metadata is just not the best way to classify content. It's extra work to produce it, and people never agree on where something should go. The end result is like usenet - the honest people try to keep on-topic, while the spammers bombard them with off-topic crap.

    What I would suggest is automated classification. There has been plenty of work over the years in AI and related technologies for parsing, digesting, and classifying raw, unmarked-up text. If babelfish can read a page and map it to another language, it's reasonable enough to expect that the relevant topics in a page can be extracted and mapped to some kind of categorization. There are a number of companies out there selling the technology to do this right now. It's not perfect, but with human editing it can put together a web directory very quickly.

    Ultimately I envision being able to start from a page anywhere on the web, push a button, and get a list of other sites on the same topic. If it were done right it would beat keyword search by a large margin.
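
    As a toy illustration of that button, here is a sketch in Python that uses a bag-of-words representation and cosine similarity as a stand-in for whatever the commercial classifiers actually do; the page texts are invented placeholders.

    # Toy sketch of "push a button, get related pages": represent each page
    # as a bag of words and rank the others by cosine similarity.
    import math
    from collections import Counter

    def bag_of_words(text):
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    pages = {
        "gnome-components": "xml dom component framework for gnome free software",
        "rdf-metadata": "rdf metadata xml namespaces dublin core specification",
        "cooking-tips": "garlic butter pasta recipe weeknight dinner",
    }

    current_page = bag_of_words("building free software components with xml and dom")
    related = sorted(pages, key=lambda p: cosine(current_page, bag_of_words(pages[p])),
                     reverse=True)
    print(related)  # pages on the same topic should float to the top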

  • For example, at about 5 or 6 am EST I hit reload but this story was nowhere to be seen, yet there's a comment from a few minutes after it was put on the main page.



  • by dmeiz ( 9373 )
    RDF that. SGML and its derivatives (i.e., XML) are well on their way to fulfilling those very requirements; the last thing we need is a new standard.
  • You should know that XML (and SGML) are only standards for creating markup languages. RDF is an XML-compliant markup language.

  • by Cassius ( 9481 ) on Friday March 26, 1999 @01:50AM (#1962306)
    Looking at the quality of 90% of the web pages out there, I think it is probably unrealistic to expect that people will be applying RDF in an intelligent way.

    In fact, using RDF in a fractured or improper way may even be more detrimental than good ol' heuristics. Badly authored RDF will send syntactically correct but semantically incorrect metadata to a search engine equipped to handle it. This is a dangerous combination - it makes bad search results more precisely wrong. I'd rather have a good guess than a precisely wrong answer (sketched below).

    It ultimately boils down to whether you trust users to be able to describe their own metadata. I don't. Perhaps a good approach is to have centralized servers attempt to create correct RDF files based on a set of common criteria. While this is still a flawed approach, I would rather have search results that are consistent (consistently wrong or consistently right) than try to get inside the psychology of each individual web designer's implementation of RDF metadata. This approach might also cut down on metadata abuse (trying to bump up a page in searches where it should not rank highly, etc.).

    In other words, I think we're way, way off from solving the metadata/search issue. For now, the best answer seems to be human categorization (Yahoo) or smart, smart heuristics (Google, Inktomi).
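
    To make the "precisely wrong" point concrete, here is a small Python sketch. The metadata below is perfectly well-formed XML, so the parser (and any RDF-aware engine) accepts it without complaint, yet the dc:subject is exactly the sort of clueless labeling mentioned upthread. The document is invented.

    # Small sketch: well-formed but semantically bogus metadata. The parser is
    # perfectly happy, so a metadata-aware engine would index it confidently
    # and incorrectly.
    import xml.etree.ElementTree as ET

    bogus_rdf = """\
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.com/buy-our-pcs/">
        <dc:title>Cheap PCs</dc:title>
        <dc:subject>computer science</dc:subject>
      </rdf:Description>
    </rdf:RDF>
    """

    doc = ET.fromstring(bogus_rdf)  # parses fine: syntactically correct
    ns = {"dc": "http://purl.org/dc/elements/1.1/"}
    print(doc.find(".//dc:subject", ns).text)  # "computer science" -- semantically wrong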
  • fuck al gore. the internet is INTERnational.
    - NeuralAbyss

    ~^~~~^~~~^~~~^~~~^~~~~^^^~~~~~~~~~~~~~~~
    Real programmers don't comment their code.

  • Al Gore invented the Internet, Tim Berners-Lee invented the Web.

    Do you newbies know nothing?

    Samael
  • Are you promoting this "only the little people lie and we should only trust CorpGov LLC" view, or are you lamenting the fact? Personally, I like to read posts by ordinary people who then list the URL for an interesting site. I only use search engines and portals when I'm doing my own research. Another way to find sites you like is to subscribe to online newsletters on the subjects you are interested in, be they hobby, work, or other. I consider Slashdot one of those places on the web where intelligent people are exchanging intelligent information and URLs, and I like it for one.

    I'm not really sure the big-money portals really care about our needs; we are consumers to them, and they use "content" merely as an identifier of which advertiser would be interested in selling to us. Sometimes I just want to read and learn without wondering if I'm going to receive an ad because I happened to look at a certain webpage. Or those banners that pop up with Amazon.com telling you to go to their site for more information about (fill in your search subject). Do they believe that technology that puts my name on a web page or email is the kind of personalization the net needs to offer? Is this the future of the web?
  • Considering his background, it is understandable that the press would want to listen to Tim Berners-Lee rant about RDF and how it might change all search engines.

    I'm not about to dismiss RDF, but it should be noted that RDF is an application of XML, and there are many other XML-based languages that could be of equal or even greater importance. Dozens of such languages already exist, including

    • WML for wireless communications
    • MathML for mathematicians
    • CML for chemists

    I'm looking at RDF as one part of XML, which is the big thing - not at any of the individual languages.

  • No, no, Al Gore created the "Internet", and Ted created the "Web"... ;-)
  • Well, I posted what I found... not much I can do there. The thing that really disappointed me was that there were no additional links in the story.

    I guess they are in the business of "quick news" for people with low attention spans...

  • Well, he really did kinda "create" the "web"; for me there is a difference between the internet and the web. When Berners-Lee did the "web", it was with US defence, for military use. I don't think there was much college or university involvement at that point...
  • No apology needed... ;-)

    Just imagine Al Gore as President of the US - that's going to be confusing...
  • by weink ( 18584 )
    Well, it's making the internet/web "free"? OK, fine, it's just making the "story" free, taking up bandwidth, and driving me crazy reading the story...
    Yack!!
  • Hi! I'm Al Gore and I invented the web. I mean the internet. No no... I mean Ethernet... no, um... oh hell. I used BackOrifice once.
    --
    - Sean
  • ...and click the little button at the bottom that turns all the blinky, flashy things off!
    --
    - Sean
  • by Billy_Pilgrim ( 20751 ) on Friday March 26, 1999 @09:19AM (#1962318)
    I'm sorry to say this, but I hate the idea of META tags on the web. I have been an avid net user since '93, just before the browser boom. When the browsers first came out, the search engines were still based on the text of a web page. Today, you are lucky to find such a tool. Most popular engines rely on META tags, as if the general population had some kind of superior abstraction skills.

    At one time, you could force AltaVista to show only pages containing certain text or URLs. While those options are still accepted by the engine, they are largely ignored. As a user I am annoyed when I ask a search engine to show me only pages that actually contain certain strings, only to navigate to the page and turn up empty on a find (a crude workaround is sketched at the end of this comment).

    I have actually gone from 'portal' surfing (early Yahoo) to search-based surfing, back to 'portal' surfing. The web is hairy enough that I actually *do* want someone to filter out the crap for me, unless I am looking for something extremely specific and unhappy with the portal-based results.

    I do agree with some of his other thoughts, about form submission and URL changing...
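
    Going back to the "pages that actually contain the string" gripe: one crude workaround is to re-check the engine's results yourself and keep only the pages whose raw text really contains the phrase. A rough Python sketch - the result URLs are placeholders, and it obviously costs one fetch per candidate:

    # Crude client-side re-check of search results: fetch each candidate page
    # and keep only those whose raw text actually contains the phrase the
    # engine was asked for. The result URLs below are placeholders.
    import urllib.request

    def really_contains(url, phrase, timeout=10):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
        except OSError:
            return False  # unreachable pages can't satisfy the query anyway
        return phrase.lower() in body.lower()

    results = ["http://example.com/page1", "http://example.org/page2"]
    phrase = "resource description framework"
    print([u for u in results if really_contains(u, phrase)])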
  • People will and do mark up their sites in ways that they think will attract eyeballs to them, regardless of whether this has anything to do with the site's actual content. This is true by abundant empirical evidence.

    On the other hand, metatags make a lot of sense for huge commercial empires (Amazon, eBay, Buy.com, etc.) which will be willing to maintain reasonably accurate markings. I have a suspicion that in the not-so-distant future we will have a situation when the big search engines will accept (=believe) metatags from big commercial sites, but will ignore them from small fry. There may develop a "club" whose metatags Yahoo, AltaVista, Lycos, etc. will believe.
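
    If that club does emerge, the engine-side policy could be as blunt as an allowlist: honor metatags only from domains known to keep them accurate, and index everyone else by their text. A Python sketch, with an illustrative (made-up) domain list:

    # Sketch of the hypothetical "club": believe metadata only from an
    # allowlist of big, accountable domains; fall back to full-text indexing
    # for everyone else. The domain list is illustrative only.
    from urllib.parse import urlparse

    TRUSTED_METADATA_DOMAINS = {"amazon.com", "ebay.com", "buy.com"}

    def trust_metatags(url):
        host = (urlparse(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d)
                   for d in TRUSTED_METADATA_DOMAINS)

    print(trust_metatags("http://www.amazon.com/books/"))  # True  -> believe its metatags
    print(trust_metatags("http://tiny-ad-farm.example/"))  # False -> index the text instead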
  • Sorry, that mantra doesn't hold water in this discussion. Tim Berners-Lee is single-handedly responsible for the creation of the World Wide Web. While working for CERN he envisioned an Internet service/protocol based on a worldwide hierarchy of hyperlinked pages (the concept of which was around before he came up with it) and went on to CREATE the HTML specification, the HTTP protocol, and the Uniform Resource Locator (URL). He also wrote the first web browser, and in NeXTStep at that. Therefore he did invent the World Wide Web. Don't believe me? Ask the W3C [w3.org].

    Will
  • Did anybody else find it amusing to hear that the Net is too complicated, when the story was surrounded by the incredibly obnoxious over-designed 'how many ads can we fit in a window?' PC World 'interface'?

  • RDF seems like the result of a bunch of frustrated AI experts... (and Google rocks)
  • XML is one possible syntax for RDF, as is HTML - I was talking to an author of Dublin Core who showed me a draft of using DC in HTML.
  • Don't want to sound like a smartypants, but the web was invented at CERN, the European Organization for Nuclear Research, and has nothing to do with US defence. It was ARPANET (aka the internet) that was financed with US defence money, but it _was_ developed by universities!
