The Internet

Mr Anti-Google 531

MrNovember writes "Salon is running a story on some guy named Daniel Brandt who they call "Mr. Anti-Google." Mr. Brandt runs a sort of anti-establishment database of citations called NameBase as well as Google Watch. He claims that Google's PageRank system is undemocratic primarily because it doesn't rank his NameBase information very highly. He also points out that Google maintains a log of all you've ever searched for associated with a long-term cookie. Google's system seems to work the best if you ask me but, on the other hand, link popularity may not provide the most intelligent top rankings."
This discussion has been archived. No new comments can be posted.

  • better? (Score:2, Informative)

    by pixitha ( 589341 )
    isn't the system that google uses better than the pay system Yahoo does? Yahoo searches have been coming up with some really whacked results, that are totally wrong (i.e. whoever paid more...) just my $.02
  • Article (Score:2, Informative)

    by djrogers ( 153854 )
    Don't know why that was posted without a LINK TO THE FREAKIN' Article, but..

    http://www.salon.com/tech/feature/2002/08/29/google_watch/index.html
  • by Anonymous Coward
    IF people decide they don't like Google they'll search somewhere else. We vote by our searches...
    • Hey, didn't you read the article? The anti-google guy is an old 60s-style anti-establishment leftist. If you don't agree with him, you're a FASCIST, MAN! You're just HELPING THE SYSTEM keep us down! Google is undemocratic just like CAPITALISM is undemocratic! It's shameful that the top search result on google doesn't list his site higher, and something MUST BE DONE!

      To Salon's discredit, they appear to agree with this idiot, evidently finding some sort of nostalgia in his fact-averse crusade.

  • Thanks, I already know how to get to Google and Salon. What I don't know is how to find the Salon article, especially after it scrolls off of Salon's front page.
  • by pgrote ( 68235 )
    Here is the link to the story:

    Salon Article [salon.com]
  • Oh, I see (Score:5, Funny)

    by override11 ( 516715 ) <cpeterson@gts.gaineycorp.com> on Thursday August 29, 2002 @11:07AM (#4163953) Homepage
    And I think the US monetary system is unfair because I don't have enough of it!!!
  • He also points out that Google maintains a log of all you've ever searched for associated with a long-term cookie.

    Good thing I search for p0rn with cookies, Java and JavaScript turned off! I also wipe my disk cache between sessions.
  • Reality Remains (Score:3, Insightful)

    by geogeek6_7 ( 566395 ) on Thursday August 29, 2002 @11:08AM (#4163961) Homepage
    Fine with me if he wants to complain, Google still remains my number one search engine, due to its highly relevant results. You can whine all you want, but that doesn't change reality. ~geogeek
  • by Anonymous Coward on Thursday August 29, 2002 @11:08AM (#4163962)
    a [namebase.org] few [namebase.org] links [namebase.org] will [namebase.org] fix [namebase.org] namebase's [namebase.org] spot [namebase.org] on [namebase.org] the [namebase.org] list [namebase.org]
  • It's not just link popularity.. where those links come from is also very important.

    If a popular site links to yours, that has more weight than some one-off site that links to yours.. google takes this into account.

    The guy can argue all he wants.. google does not purport to have the best stuff at the top all the time.. but if this guy's site was so good, then more people would link to it, and if more people linked to it, it would be more popular.
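
    A minimal sketch of the point above, assuming the simplified PageRank formulation from the published description rather than whatever Google actually runs in production. The toy link graph, the page names, and the damping value of 0.85 are assumptions for illustration. Both site_a and site_b have exactly one inbound link, but site_a's link comes from a page that is itself widely linked.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages with no outgoing links spread their rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: three pages link to "popular", nobody links to "oneoff".
toy_links = {
    "p1": ["popular"],
    "p2": ["popular"],
    "p3": ["popular"],
    "popular": ["site_a"],
    "oneoff": ["site_b"],
}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:8s} {ranks[page]:.4f}")
```

    In this toy graph site_a ends up ranked above site_b even though each has a single inbound link, because the rank a link passes on depends on the rank of the page it comes from.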
    • Re:Gee wow (Score:3, Insightful)

      by Kythorn ( 52358 )
      This system isn't without flaws though, particularly in an online community such as Slashdot itself.

      If, for instance, some poster decided to post a single link [copacetic.nu] to some obscure worthless website that nobody's ever heard of, let alone linked to, in a comment such as this one, it will be ranked according to the total of slashdot's calculated 'popularity'-based weight.

      Google does not, and probably can not distinguish between actual content on this site and inane comments made by people such as myself.

      Is this a large flaw? I really can't say, and I certainly don't have a solution to propose. I still say google is the best thing out there, and beats the hell out of inktomi's paid listings, which power an ever growing number of search engines.
  • by CampbellXL ( 550295 ) on Thursday August 29, 2002 @11:08AM (#4163972)
    ...he'd increase his page ranking on google if he removed the little tin foil hats from his servers.
  • by Anonymous Coward
    The older your site is (and the better it is), the more likely that it will be linked to, and linked to well. If your site is new, small, or bad, very few people will link to you.

    Compared to the other search engines, Google is great, and that's what matters. Is it possible that someone could make a better search engine? Maybe. Please, try. Competition is good for everyone.
  • by greymond ( 539980 ) on Thursday August 29, 2002 @11:10AM (#4163989) Homepage Journal
    yeah I hate google because:

    1) When I type in my name IT DOESNT SHOW IT!

    2) My websites are not listed #1 NO MATTER WHAT YOU TYPE IN!

    3) Their image search doesn't have PHOTOS OF ME!

    4) I hate all other search engines for THE SAME REASONS!

    wa wa wa......
  • His site [namebase.org] isn't loading for me. Guess I'll have to go Google's cache to - oh, wait a minute... it's not in there! How rare!
  • I just tried visiting google-watch.org, but it seems to be down ("document contains no data"). So I google [google.com] for it.

    Caching has been disabled for the site.
  • Google Cookies (Score:5, Insightful)

    by Compulawyer ( 318018 ) on Thursday August 29, 2002 @11:12AM (#4164004)
    I have Mozilla set to disallow cookies from Google and I've never noticed any difference in the quality of search results between searches with cookies permitted/denied. Even if it is true that Google tracks searches, at least it isn't REQUIRING cookies to be enabled before you can search.

    As for the point made that this guy thinks that Google is "undemocratic," give me a break! Google is not a government - it is a search site! They exist to make a profit. They will make money by providing a quality search result, thereby attracting users. They are not in the business of being the arbiter of democratic principles on the web.

    • Re:Google Cookies (Score:2, Insightful)

      by Telastyn ( 206146 )
      Actually, most search engines exist to make a profit by selling off the results to the highest bidder.

      Capitalism and Democracy are rarely congruent.
      • Re:Google Cookies (Score:3, Interesting)

        by Compulawyer ( 318018 )
        True, but search quality is measured by the closeness of the match between the search performed and the result of that search. If a high-ranked result is an excellent or even good match to the search performed (meaning that it is what the searcher wanted in the first place) then the fact that the high ranking was sold rather than generated by an algorithm that does not account for financial/business relationship factors is completely meaningless. The searcher got what s/he wanted - a quality result. The high-listed site got what it wanted - high placement in Google's result listing. Google got what it wanted too - payment for steering traffic to the site.

        As in this example search for snowboard retailers [google.com], Google even tags the top results as "Sponsored Links" so even the searchers know that those sites are ranked first because they paid Google to be ranked first. If it is what the searcher wanted, it doesn't matter.

        IMO, this is no different from a company purchasing a large ad in the yellow pages of the phone directory. Does/should anyone think that the ads are bigger for certain companies because those companies are better? People know that companies buy those ads. Searchers should also know that "sponsored" = paid. I don't see anything inappropriate. In fact, I credit Google for being above some of the slimy companies on the web and staying above the board with its business practices. Google's ability to charge for ranking will be nil if its search results reduce to the point of being roughly equivalent to random advertising.

    • Re:Google Cookies (Score:3, Insightful)

      by NanoGator ( 522640 )
      So... Google makes a convenience feature of storing your search terms in a cookie (which people are all well aware of), and that's a security risk? Google was being sneaky?

      I think the reason this guy's being told to shut up is that everything he says sounds like propaganda. When he talks it reminds me of that "If you download MP3s, you're supporting COMMUNISM" ad I saw a while back.
    • Re:Google Cookies (Score:5, Insightful)

      by Kwil ( 53679 ) on Thursday August 29, 2002 @12:49PM (#4164781)
      From the article:

      More than that, says Brandt, Google is a careless custodian of private information. When you search for something at Google, it saves your search terms and associates them with a cookie that is set to live on your machine for 36 years. Brandt fears that law enforcement officials could muscle Google into divulging all the terms you've ever searched for. Those terms could be "a window into your state of mind," and are therefore a clear violation of your privacy, he says.

      Uh, Does Brandt even properly understand how Cookies work? If the Feds go to Google and say "Give us all the cookies you've stored on people's computers" Google is going to say "Uh.. see, that's the thing about storing them on other people's computers.. we don't store them here."

      And as for Google recording every search term I've searched for, let's be realistic here, even if Google did have that kind of storage space available (every term, for every user, with a link between each?) why in the heck would they use it for that when they have the whole freakin' 'net to try and store?
      • Re:Google Cookies (Score:4, Interesting)

        by afidel ( 530433 ) on Thursday August 29, 2002 @01:31PM (#4165122)
        Google probably does have the space. They have about a half dozen (probably more now) copies of their database at each of their colocation facilities, for load balancing and redundancy purposes. By increasing their server farms by say 10-20% they could pretty easily store all of the searches they run every day in some compressed format.
      • Re:Google Cookies (Score:3, Insightful)

        by singularity ( 2031 )
        Think about this situation, though: You are under investigation for something, so the Feds nab your computer with a search warrant. They grab the cookie from your computer, and then go to Google with a subpoena for that information.

        I think that once you have a judge consent to a search, getting him/her to sign off on asking Google is a minor hassle.
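
        A rough sketch of the mechanism being argued about, assuming a server that keeps its own log keyed by the identifier it puts in the cookie. Nothing in this thread confirms how Google actually stores anything: the cookie name "ID", the in-memory log, and the handle_search function are all hypothetical. The only figure taken from the article is the 36-year lifetime.

```python
import uuid
from collections import defaultdict
from http.cookies import SimpleCookie

# Hypothetical server-side log: visitor id -> list of search terms.
search_log = defaultdict(list)

# Cookie lifetime from the article (36 years), expressed in seconds.
THIRTY_SIX_YEARS = 36 * 365 * 24 * 3600


def handle_search(cookie_header, query):
    """Record a search and return the Set-Cookie value for the response."""
    incoming = SimpleCookie()
    incoming.load(cookie_header or "")
    if "ID" in incoming:
        visitor_id = incoming["ID"].value  # returning browser
    else:
        visitor_id = uuid.uuid4().hex      # first visit: mint a new id

    # The terms live on the server; the browser only ever holds the id.
    search_log[visitor_id].append(query)

    outgoing = SimpleCookie()
    outgoing["ID"] = visitor_id
    outgoing["ID"]["max-age"] = THIRTY_SIX_YEARS
    return outgoing["ID"].OutputString()


first = handle_search("", "nixon")
visitor = first.split(";")[0].split("=", 1)[1]
handle_search(f"ID={visitor}", "rumsfeld")
print(search_log[visitor])
```

        Under these assumptions the cookie itself holds no search terms, which is Kwil's point, but anyone holding the id from the cookie can ask the server for everything logged against it, which is the scenario described above.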
    • Re:Google Cookies (Score:4, Insightful)

      by ergo98 ( 9391 ) on Thursday August 29, 2002 @01:00PM (#4164869) Homepage Journal

      Google is not a government - it is a search site! They exist to make a profit. They will make money by providing a quality search result, thereby attracting users.

      Google has become one of the most important gatekeepers on the net, and they literally can make or break businesses by playing with their database (I wonder if they have checks and balances to ensure that Google workers aren't doing favours, returned by some $, for people by tweaking their rankings). Your claims that they're just some business is about as valid as saying ICANN is just some business that can do what they want. Uh huh.

      In any case, the Salon article was pathetic. As much as I might disagree with this guy's opinion that Google sucks because it doesn't rank him highly, there is no doubt that we need to be vigilant that the net isn't usurped by any one group or individual. The Salon article used a classic right-winger technique of refuting every one of his claims with some absurd parallel claim: It's hard to get too upset about search privacy at Google when, all over the Web, other sites are increasingly playing fast and loose with private data....Yahoo, which requires sign-in for portal services, has already announced a plan to e-mail ads to people based on what they've searched for. (The plan, called Yahoo Impulse Mail, is "opt-in.") If you wanted to be a watchdog for the privacy of search, wouldn't you start by attacking that program?... Uh huh. "Well, sorry that the police raped and beat the kids walking down the street...but in Afghanistan they behead them too! Go pay attention to them, there's nothing to see here! {YOINK} (running away)". It's a pathetic, and dangerous, technique of disqualifying a complaint.

      And what's with the ridiculous Google-love on here? You'd think that every Slashdotter was a majority shareholder. Google is my search engine of choice, but when Doubleclick tracks what you do there's an outrage on Slashdot. When Google technically has the capability to pull up every search you've ever performed (errr "genital warts"), it's a non-issue? Uh huh.

  • by cpex ( 601202 ) <(ude.dscu) (ta) (anovivj)> on Thursday August 29, 2002 @11:12AM (#4164005)
    Google is a very good search engine. And I don't know what the hell this Mr. Anti-Google is talking about, "undemocratic". Everyone knows google is powered by pigeon clusters [google.com], millions of pigeons voting on the relevant sites
  • I've thought for a while that, although Google is undoubtedly a fine search engine, it does make it difficult to get on it in the first place.

    Since you need to have links to your site from other sites to get rated highly in Google, it is almost impossible to get them, as people who may be interested in linking to your site won't find it on Google.

    Vicious circle, anyone?

    Goblin
    • Not really true, you can just submit your site to google and it will visit it (well.. within a few months and as long as it's not a web nightmare). Links help but they aren't the only way to get in.
      Once in, decent content will mean you will eventually get links as people find you - even if you start on page 10 in the results.
    • I think of Google like a mirror; it just reflects what it sees. If your site is good enough, people will link to it, regardless if it's listed on Google or not. You don't need Google to get your site viewed. How did we ever get popular websites before Google? Such things as word-of-mouth, IRC, etc. are ways people can spread their URLs. Google isn't the only medium out there to get your site seen, and my point is that you shouldn't be relying on Google, because it's just a mirror.
  • Google, if I recall correctly, ranks things not only based on quantity of links, but also based on who links to you. Thus the link from Slash will help him out a bit.

    although strangely enough, apparently so will links from subdomain sites like geocities, etc.

    so now he merely has to complain about his monthly bandwidth allotment getting used up, and his server crashing due to /.

    He can't win

    • Funny how he disables his site when we link to him in attempt to find something interesting. I was curious and I might have linked to him. Guess he will never be google'd. And he wonders why...
  • It's funny how salon would focus on a total non-issue like this (for Christ's sake, just turn off cookies) but completely ignore things like Yahoo resetting everyone's mail options to opt-in. I guess there wasn't some crank they could quote for that article.

    Fear not, they'll soon be gone...
  • It seems to me... (Score:4, Insightful)

    by bziman ( 223162 ) on Thursday August 29, 2002 @11:13AM (#4164018) Homepage Journal
    That if you don't like Google... then you shouldn't use Google. Duh. Why the holy crusade? If you think Altavista or hell, Netscape Search meets your needs, then use it. Why do people find it necessary to attack everything instead of being constructive. Humbug.

    -brian

    • It's not that he's using Google, but that other people using Google don't find his site.

      According to the article, his complaint is twofold: Google favors popular, established sites over young or unpopular sites. Also, he fears the cookie.

      I am Slashdot's complete lack of interest in his problems.
    • I Disagree (Score:4, Interesting)

      by FreeUser ( 11483 ) on Thursday August 29, 2002 @11:50AM (#4164338)
      That if you don't like Google... then you shouldn't use Google. Duh. Why the holy crusade? If you think Altavista or hell, Netscape Search meets your needs, then use it. Why do people find it necessary to attack everything instead of being constructive.

      I think, to be quite blunt, that this is a crock of shit.

      One of the most important things in a civil society is the set of checks and balances criticism offers on any service, any government, any individual, indeed, any endeavor undertaken. These checks and balances, and public criticism generally, become vastly more important when the perceptions and lives of many people are impacted.

      This is true whether one is criticizing GNU, Linux, Richard Stallman, our corporate masters in the form of George Bush, Enron, WorldComm, Microsoft, Apple, Sun Microsystems, Red Hat, or whomever else happens to be in the hotseat at any given time.

      If Google really were stacking their search results, criticism and a 'holy crusade' as you so snidely put it, would be a very important counterbalance in offsetting the corruption and distortion inherent in such a thing, particularly given how trusted Google is.

      I disagree with the guy's criticism, for what it is worth, and am an ardent user of Google. But I agree wholeheartedly with the need for such criticism to keep the likes of Google honest, and to call them on the carpet when they do something shady or wrong (like they did when they caved to the Cult of Scientology's pressure to censor the search results revealing critics of that particular organization).

      This "if you don't have something nice to say, don't say anything at all" is a fine creed for slaves or submissive corporate drones, but it has no place at all in the marketplace of intellectual thought or debate.

      Now, on the other hand, if you'd like to argue for civil discourse instead of flame fests and random insults, I will be the first to add my voice to yours, but lest we forget, civil discourse can and must include criticism, sometimes vehement criticism. Indeed, such can often be the most important civil discourse being conducted.
  • by Todd Knarr ( 15451 ) on Thursday August 29, 2002 @11:13AM (#4164027) Homepage

    Google's PageRank system isn't supposed to be democratic. It's supposed to be effective. When he starts arguing that Google doesn't consistently return pages that meet the search criteria, then I'll listen to him.

    • by count0 ( 28810 ) on Thursday August 29, 2002 @11:36AM (#4164234)
      PageRank is a very 'democratic' algorithm. That's why it's effective. Democracy doesn't guarantee equal airtime...it guarantees that popular parties/candidates/platforms/web sites get *more* airtime. Democracy is not equitable in distributing power beyond 1 person/1 vote, and it isn't supposed to be.
    • Google's PageRank system isn't supposed to be democratic. It's supposed to be effective.

      Amen!

      In fact, any search engine that displays results relevant to your search terms won't be "democratic", because you'll get (wait for it) ... results relevant to your search..

      I mean duh.

      If I search for VPN configuration info or the GNP of Luxembourg in 1997 (I assume neither of which are on his site), and his site doesn't come up, is he going to complain about that too?
  • by nicklott ( 533496 ) on Thursday August 29, 2002 @11:14AM (#4164029)
    Brandt runs a sort of anti-establishment database of citations called NameBase as well as Google Watch.

    Not any more he doesn't...

  • The guy is also some whiny leftist who complains that if you search for "united airlines" Google actually has the nuts to present the "united airlines" website, and not some stupid protest site.

    Dumb dumb dumb dumb... but hell, what do you expect from something that ran in Salon anyway?
  • link popularity (Score:2, Insightful)

    by NixterAg ( 198468 )
    Google's system seems to work the best if you ask me but, on the other hand, link popularity may not provide the most intelligent top rankings.


    That's probably true from the standpoint that link popularity may not offer the most relevant information for what you might be looking for, but on the average, I'd say it's by far the best way to go. The main problem I see with that type of determination is that by having a link at the top of your search results, it's likely that the site will continue to stay more popular than it might deserve. As a result, the system becomes somewhat self-defeating.
  • by jacobito ( 95519 ) on Thursday August 29, 2002 @11:15AM (#4164043) Homepage
    Namebase.org has been slashdotted now, but I had a look at it this morning, and I couldn't help noticing a tag similar to the following in the <head> of the index page:

    <meta name="Googlebot" content="noarchive">

    I can't help but think that this doesn't help his Google rankings.
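
    For anyone who wants to check a page for this kind of tag, here is a small sketch using Python's standard html.parser. The class and function names are made up for illustration. As far as the directive itself goes, noarchive asks a crawler not to offer a cached copy, which fits the missing cache other posters noticed. Whether it has any effect on ranking is not something this sketch can tell you.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            values = {
                item.strip().lower()
                for item in attr.get("content", "").split(",")
                if item.strip()
            }
            self.directives.setdefault(name, set()).update(values)


def robots_directives(html):
    """Return a dict mapping the meta name to its set of directives."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="Googlebot" content="noarchive"></head></html>'
print(robots_directives(page))
```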
  • Boo Hoo (Score:3, Redundant)

    by maggard ( 5579 ) <michael@michaelmaggard.com> on Thursday August 29, 2002 @11:17AM (#4164051) Homepage Journal
    Brandt thinks his material should be ranked higher because it's more relevant.

    To his agenda perhaps.

    However Google isn't used by most folks as a directory - it's a search engine. It simply pulls up entries according to a formula (see pigeonrank [slashdot.org] for the inside scoop) and gives those back. No bias beyond what smart webmasters can impart, no artificial clustering, etc.

    If Google were to start doing as Brandt wants it would quickly run into endless battles, lose its searching edge, become just another pay(or agenda)-for-play roadkill.

    No thanks.

    • Re:Boo Hoo (Score:3, Interesting)

      by wfrp01 ( 82831 )
      However Google isn't used by most folks as a directory - it's a search engine. It simply pulls up entries according to a formula (see pigeonrank [slashdot.org] for the inside scoop) and gives those back.

      I agree, and I wish Mr. Brandt would suggest a workable alternative, rather than whining. He clearly has a monkey on his back.

      However, I do wonder about the efficacy of google's formula. My concern is that google's popularity turns its page rankings into self-fulfilling prophecies. It's a positive feedback loop: a site w/ a high google rank gets more views and more links, which increases its google rank, ad infinitum.

      Like you say, I'd rather not have search engines be driven by agendas or money. But I believe anything can be improved upon. Personally, I believe perhaps a bit of randomness might help. Instead of receiving an absolute page rank, pages should receive a probability of being listed higher or lower. Just a thought.
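
      One way to read the randomness idea, as a sketch only: treat each page's score as a weight and draw the result order by weighted sampling without replacement, so high scores usually come first but not always. The scores and page names are invented, and nothing here reflects how any real search engine orders results.

```python
import random


def randomized_ranking(scores, rng=None):
    """Return pages in a random order biased toward higher scores.

    scores maps page -> positive score. At each step one of the
    remaining pages is drawn with probability proportional to its score.
    """
    rng = rng or random.Random()
    remaining = dict(scores)
    order = []
    while remaining:
        pages = list(remaining)
        weights = [remaining[p] for p in pages]
        pick = rng.choices(pages, weights=weights, k=1)[0]
        order.append(pick)
        del remaining[pick]
    return order


# Hypothetical scores for three pages matching the same query.
example_scores = {"established": 8.0, "middling": 3.0, "newcomer": 1.0}
print(randomized_ranking(example_scores, random.Random(2002)))
```

      With these example weights the established page would be drawn first about two times in three, so the newcomer gets an occasional shot at the top spot without the ordering becoming arbitrary.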
      • Re:Boo Hoo (Score:3, Interesting)

        by wfrp01 ( 82831 )
        While I'm at it...another idea. Google should publicly state that if you put some particular meta tag in your document, that they will publish the contents of that tag (or tags?) in their page rank summary. This would encourage people to write good summary overviews of their pages, and would help users find things easier. With their clout, they could easily create a de-facto open meta-data standard. Use it or lose.

        As opposed to summaries that typically look like: ... Fri Dec 29 2000 Claudio Matsuoka : 5.49-39cl; put $CHKROOT
        inside ... fixes suport to "linux confirm"; make utmp group 22. Sat Aug 21 1999 ...
  • How popular can http://www.namebase.org/ be if it goes down before 30 comments have been posted?

    For crying out loud, my PERSONAL web site can handle more traffic than that.

    What's he hosting it on, a dialup?
  • I don't mind someone having their point of view, in fact I applaud Mr. Brandt for furthering what he believes. However, search engine popularity is so flighty, if I think another engine is better than Google, I'll use it. Honestly, I have no ties to any search engine and feel I never will. However, Google has been able to stay at the top of the list (at least my list) for quite some time and has also managed to put the least amount of advertising (or harassment) in my face. I used to use yahoo, until the pop-ups and ads overwhelmed me. I think much of Google's success came from the fact they never went public. This and the text based ads are incredible decisions when every other search engine was greedily grabbing web based advertising revenue. I like Google, I'll continue to use it, but I'm not going to fight for it either. Just my two cents.
  • by autopr0n ( 534291 ) on Thursday August 29, 2002 @11:18AM (#4164064) Homepage Journal
    Hrm, both namebase and googlewatch seem to be down. Is this just an innocent slashdotting?

    Or have the Google gods turned their clusters towards more sinister deeds, silencing their critics.

    We may never know.
  • by maggard ( 5579 ) <michael@michaelmaggard.com> on Thursday August 29, 2002 @11:20AM (#4164073) Homepage Journal
    Brandt thinks his material should be ranked higher because it's more relevant.

    To his agenda perhaps.

    However Google isn't used by most folks as a directory - it's a search engine. It simply pulls up entries according to a formula (see pigeonrank [google.com] for the inside scoop) and gives those back. No bias beyond what smart webmasters can impart, no artificial clustering, etc.

    If Google were to start doing as Brandt wants it would quickly run into endless battles, lose its searching edge, become just another pay(or agenda)-for-play roadkill.

    No thanks.

  • by asv108 ( 141455 ) <asvNO@SPAMivoss.com> on Thursday August 29, 2002 @11:20AM (#4164080) Homepage Journal
    First Red Hat, now google, I guess when you're on top you need to prepare for unsubstantiated criticism.

  • PageRank works. If your page is linked to by a large number of well trafficked sites, then you get ranked higher. If you're some crackpot whose site no one cares about, you don't get a high rank...

    In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."

    Don't blame Google for equating accuracy and usefulness with popularity. It's either that or resort to subjective measures.
  • Sour grapes (Score:5, Insightful)

    by timholman ( 71886 ) on Thursday August 29, 2002 @11:21AM (#4164096)
    Upon reading the article, you find that Mr. Brandt's main complaint about Google is that he believes that when you type in, say, "Richard M. Nixon" into Google, the material he has compiled on Nixon should be ranked #1.

    Okay, so I did a search on Nixon on Brandt's site. Here are the first couple of results:

    (1) How the Vatican conspired to hide Nazi war criminals.
    (2) How various activists were persecuted by the CIA and FBI.

    Nowhere did I even SEE Nixon's name in these abstracts. The only relevance is that Nixon was alive at the time, or maybe president when some of them took place, but hardly the man personally responsible for all of them.

    When I type "Nixon" into Google, I expect to see biographical material, both good and bad, not totally unrelated rantings. Google is doing its job, in my opinion. It is giving low rankings to Brandt's irrelevant materials. His complaints are pure self-centered sour grapes.
  • by lunenburg ( 37393 ) on Thursday August 29, 2002 @11:21AM (#4164097) Homepage
    This guy's just whining because Google doesn't rank pages according to his crackheaded counterculture views? And this is news?

    Google must be doing pretty well if this is the worst criticism they can find about them.
  • "Google ranks my muckraking site rather low with regard to searches on individuals, so the algorithms they are using must be EVIL! EVIL!"
  • I've never seen where Google has put a cookie that does more than save my search settings. In fact I've never seen where it saves all my search terms. What is doing the search term saving is Internet Explorer doing the auto form filling bit.

    This chap seems to be little more than someone who is holding a grudge against google because his website isn't as high on the list as he wants.

    Well Tough @#$%, life sucks doesn't it.

    What this guy needs to learn is that what helps your score on Google isn't just the content, but how many people link to your site for that information. Thus having a page on Rumsfeld isn't as helpful as having a page on Rumsfeld that 50 sites link to.

    If this guy wants a higher ranking then he has to make relationships with other websites to get his rankings up. It's not that hard as most webmasters know this and a link sharing helps them as much as it would him.

    He's just a whiny person who happened to catch the attention of some person who needed to fill out todays news space on Salon.

    Ignore him and hopefully he'll go away.

    Phoenix

  • The last time I checked, Google wasn't a democracy. If it was, I wouldn't have voted for that name. Since it isn't - oh, well.

    There's one in every crowd.
  • by Anonymous Coward
    Last year we were able to use Namebase to identify a rogue investor as having trained at the knee of Robert Vesco. Remember Vesco? The most successful international swindler of all time, and friend of the Whitehouse plumbers? Same guy. Ordinary due diligence did not turn up this information. Brandt may be offkey on Google, but he gets my vote of thanks.
  • by Aix ( 218662 ) on Thursday August 29, 2002 @11:23AM (#4164111) Homepage
    More than that, says Brandt, Google is a careless custodian of private information. When you search for something at Google, it saves your search terms and associates them with a cookie that is set to live on your machine for 36 years. Brandt fears that law enforcement officials could muscle Google into divulging all the terms you've ever searched for. Those terms could be "a window into your state of mind," and are therefore a clear violation of your privacy, he says.


    Maybe I'm missing something here, but how is this a violation of your privacy? I mean, the whole thing is that you are using their service for free and willfully sending them the data that you choose. Everyone gets to choose what they search for in a search engine. This isn't private information in any real way. Google is providing you the free service of looking up words that you have intentionally provided. You don't like them being associated with a cookie? Refuse the damn cookie! Really paranoid? Go wander the web on your own without a search engine!


    At what point were you guaranteed the free and anonymous use of a search engine? You're not being forced to use it. The world doesn't discriminate against people who choose not to efficiently search the web.



    People like this are blurring the privacy issue and focusing attention away from legitimate privacy issues.

  • by lingqi ( 577227 ) on Thursday August 29, 2002 @11:24AM (#4164112) Journal
    "I am some poor guy who runs a second-grade website and since I can't get google to list me high, I will elicit some news media to get my site slashdotted"

    hope you like your servers toasty, bud.
  • Moderation of hits? (Score:3, Interesting)

    by paladin_tom ( 533027 ) on Thursday August 29, 2002 @11:24AM (#4164114) Homepage

    If letting Google rank the pages is undemocratic, what about a system in which, when you go to a page from a Google search, Google adds a frame at the top of your page that lets you vote on how useful this page was on a scale of 1-10?

    Then, the most popular hits for a given set of search words would have their Google ranking rise. Now that's democracy.
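    A minimal sketch of that proposal, assuming votes arrive as (url, score) pairs for a single search query. The function name and the example URLs are made up for illustration, and nothing here guards against repeated or automated votes:

```python
from collections import defaultdict


def rank_by_votes(votes):
    """Order URLs by their mean 1-10 usefulness score, highest first.

    `votes` is an iterable of (url, score) pairs. This is a hypothetical
    data shape chosen for the sketch, not anything Google exposes.
    """
    totals = defaultdict(lambda: [0, 0])  # url -> [sum of scores, vote count]
    for url, score in votes:
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range: {score}")
        totals[url][0] += score
        totals[url][1] += 1
    means = {url: total / count for url, (total, count) in totals.items()}
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(means, key=lambda url: (-means[url], url))


example_votes = [
    ("a.example", 9),
    ("b.example", 4),
    ("a.example", 7),
    ("c.example", 10),
]
ranking = rank_by_votes(example_votes)
```

    Every vote counts the same here, which is exactly what makes a scheme like this easy to sway with a large number of cheap votes.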


    • If letting Google rank the pages is undemocratic, what about a system in which, when you go to a page from a Google search, Google adds a frame at the top of your page that lets you vote on how useful this page was on a scale of 1-10?


      Are you serious?

      Do you think someone might think to abuse this system? Automated form filling, anyone? Even if that were prevented, it wouldn't be too hard or expensive to hire hundreds of low paid data entry people to vote a site up.

      The google algorithm can be manipulated to some extent but it has stood up pretty well so far. A voting system could be manipulated much more easily.
  • by mblase ( 200735 ) on Thursday August 29, 2002 @11:26AM (#4164134)
    Google's PageRank algorithm, the celebrated system by which Google orders search results, is not, as Google says, "uniquely democratic" -- it's "uniquely tyrannical." PageRank is the "opposite of affirmative action," he has written, meaning that the system discriminates against new Web sites and favors established sites
    So Google gives preference to established sites that have proven themselves in the mind of other Web sites to be content-worthy. So what? This isn't "tyrannical" at all -- "tyrannical" would mean that they decide which sites go up and which ones don't, and where. (Yahoo!, in other words.) "Democratic" means they let the rest of the Internet "vote" on which sites are most relevant, based on hyperlinks.

    What this guy wants, by abolishing PageRank, is a return to the free-for-all of early search engines, where the loudest voice rules. If one page has more keywords, it's ranked higher -- whether or not those keywords appear in the context of relevant content.

    When you type "NameBase" into Google, Brandt's site comes up first, but Brandt is not satisfied with that. "My problem has been to get Google to go deep enough into my site," he says. In other words, Brandt wants Google to index the 100,000 names he has in his database, so that a Google search for "Donald Rumsfeld" will bring up NameBase's page for the secretary of defense.
    Here's his real problem: he thinks that linking to "Donald Rumsfeld" should bring his site's page to the top, despite the fact that he has no actual content -- just a list of links to other pages with content.

    He calls this a failing of PageRank. I call it whining. If he wants more links from Google, he should get the word out about his site (preferably without manipulating Salon.com into doing it for him) and add some actual information about the people he's archiving by hand, instead of just building a big hotlist about them.

    In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."
    Basically, he wants to be the tyrant he imagines Google to be. Well, let him want all he likes. Google's popular because it's good and it's relevant; the fact that a tiny tiny minority think it's not isn't a good reason to overthrow the whole system.

    People who live in glass houses shouldn't throw stones. He should start by making changes on his own site, not insisting Google make changes on theirs.
  • by lingqi ( 577227 ) on Thursday August 29, 2002 @11:26AM (#4164135) Journal
    Google responds by stating that now all of their pigeons will go through an "introduction to democracy" short course, and all "bird seed" websites are now ranked by humans instead of the patented "pigeon rank" system.
  • by happystink ( 204158 ) on Thursday August 29, 2002 @11:26AM (#4164140)
    People can theorize about Pagerank all they want and come up with 100 theories of why it's not correct and won't give you good results.. but guess what, that's all in theory, and in reality, Google gives amazing results. Pagerank will probably fall by the wayside in the years to come as more sophisticated algorithms come along, but for now, it is ludicrous to suggest that it doesn't work, when you just have to search for anything on Google to see its usefulness.

    Also, this guy claims that Google keeps a record of what everyone searches for.. what proof does he have of this? That Google sends a cookie? That cookie is more likely than anything just used for tracking how often most people use the site, so they can create aggregate numbers of unique users, etc. Sure they could be tracking every search term, but why would they, think how much storage space that'd waste for no return. If the FBI ever wants to find out what this guy searches for, they'll just contact his ISP and have him monitored that way.
  • by Angst Badger ( 8636 ) on Thursday August 29, 2002 @11:27AM (#4164144)
    Brandt sounds like a whiny crank who is rather missing the point. OTOH, he is correct that Google applies a persistent tracking ID via cookies, which I had not previously noticed, and about which I'm not terribly happy. And no, I don't think Google has any sinister motives, and I wouldn't be surprised if they use the tracking ID in some way to enhance the effectiveness of their engine --- but having my searches tracked in any way still makes me uncomfortable. I'd like to hear why it is being done.

    In the meantime, anyone who would like to cover their tracks can use my cookie:

    .google.com TRUE / FALSE 2147368045 PREF ID=111439b95052c72a:TM=1030056425:LM=1030056425:S= v7T9QSFKEkI

    Of course, if it turns out that Google is planning to give a prize to the most active user, or they have some kind of search engine green stamps, you're screwed. ;)
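    For anyone curious what is in that cookie line, here is a rough sketch that splits it into fields. It assumes the usual Netscape cookies.txt column order (domain, subdomain flag, path, secure flag, expiry in Unix seconds, name, value) and guesses that the TM part of the value is also a Unix timestamp; neither guess comes from Google. The names `parse_cookie_line` and `COOKIE_LINE` are just for this example:

```python
from datetime import datetime, timezone

# The cookie line as posted, whitespace-separated (cookies.txt normally uses tabs).
COOKIE_LINE = (
    ".google.com TRUE / FALSE 2147368045 PREF "
    "ID=111439b95052c72a:TM=1030056425:LM=1030056425:S= v7T9QSFKEkI"
)


def parse_cookie_line(line):
    """Split one cookies.txt-style line into a dict of its seven columns."""
    # maxsplit=6 keeps any whitespace inside the value column intact.
    domain, subdomains, path, secure, expires, name, value = line.split(maxsplit=6)
    return {
        "domain": domain,
        "include_subdomains": subdomains == "TRUE",
        "path": path,
        "secure": secure == "TRUE",
        "expires": datetime.fromtimestamp(int(expires), tz=timezone.utc),
        "name": name,
        "value": value,
    }


cookie = parse_cookie_line(COOKIE_LINE)

# The PREF value looks like colon-separated KEY=VALUE pairs.
fields = dict(part.split("=", 1) for part in cookie["value"].split(":"))

# Assumption: TM is the time the cookie was set, in Unix seconds.
set_time = datetime.fromtimestamp(int(fields["TM"]), tz=timezone.utc)
lifetime_years = cookie["expires"].year - set_time.year
```

    Under those assumptions the expiry lands in 2038 and the cookie was set in 2002, which lines up with the 36-year lifetime quoted from the article.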

    • Brandt sounds like a whiny crank who is rather missing the point. OTOH, he is correct that Google applies a persistent tracking ID via cookies, which I had not previously noticed, and about which I'm not terribly happy.

      You're right, this is terrifying!

      I can see the google conversations now:

      Employee: Sir, our servers have indicated person #111439b95052c72a has a really interesting search pattern! I think we should send this info to the FBI for investigation!

      Boss: Good work! Tell me his name and address, and we'll send that info over to the feds right away.

      Employee: Ummm, name and address? How about a few dozen IP addresses from AOL's proxy servers instead?

      Boss: Doh.
  • When you type "NameBase" into Google, Brandt's site comes up first, but Brandt is not satisfied with that. "My problem has been to get Google to go deep enough into my site," he says. In other words, Brandt wants Google to index the 100,000 names he has in his database, so that a Google search for "Donald Rumsfeld" will bring up NameBase's page for the secretary of defense. -- From the Salon article

    So it seems that this guy's real problem isn't with how Google ranks his site, but rather that Google isn't pushing his product to every searcher who hits their site. So he talks about the "undemocracy" of Google, but when it comes down to it, his main issue is that Google isn't helping his business, or rather, that Google's ranking algorithm isn't compatible with his business plan.

    Too often, when people say something is undemocratic, it's just because they aren't getting their own way.

  • by Bogatyr ( 69476 ) on Thursday August 29, 2002 @11:27AM (#4164148) Homepage
    First, a link to the article:
    http://www.salon.com/tech/feature/2002/08/29/google_watch/index.html

    Second, a quote from the article:
    "Brandt sees this as Google's major flaw. "I'm not saying there aren't some sites that are more important than others, but in Google the sites that do well are the spammy sites, sites which have Google psyched out, and a lot of big sites, corporate headquarters' sites -- they show up before sites that criticize those companies.

    In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."

    I must disagree with the ideal expressed here as Mr. Brandt's. If I was searching for material on the Web about Donald Rumsfeld, I would rarely search for information critical of him *first*. If I was ego surfing on myself, I'd want to see my own material about me returned by Google, ahead of negative reviews and sites. I don't think that's an unfair way for Google to operate. While some of the issues Mr. Brandt raises might be valid, I do not feel that Google is required to promote or support Mr. Brandt's agenda over the agenda of the people and organizations Mr. Brandt chooses to focus on.
  • k00k? (Score:2, Interesting)

    by TheTick ( 27208 )

    Is it just me, or does this guy sound like yet another internet kook? Get "untied.com" ranked first when searching for "united airlines"? That makes no sense.

    Google is a system -- a system that works a certain way. His complaints about PageRank are like complaining about an automobile for the way its wheels go 'round and 'round.

    I'm surprised salon dedicated any article space to this.

  • by Anonymous Coward
    This is especially popular with protesters [google.com], using the pagerank algorithm to rank higher than the company / organisation that they are protesting about. It's called google bombing. If he hates google so much then google bomb his site.
  • by Anonvmous Coward ( 589068 ) on Thursday August 29, 2002 @11:30AM (#4164174)
    I love this quote:

    "Having a thousand links from sites that are performing poorly does no good!"

    Here's how I interpret it:

    "Only sites that have really marketed themselves are viable. Pay me to make your site more popular."

    Anybody get an impression that he was saying something different? I don't think the guy understands that his comment makes no sense if Google's popularity's based on people finding what they're looking for. It seems more to me he's just mad that he can't buy search rankings.
  • Namebase vs. Google (Score:4, Interesting)

    by swm ( 171547 ) <swmcd@world.std.com> on Thursday August 29, 2002 @11:32AM (#4164194) Homepage

    NameBase...is...designed...to find books and newspaper articles that mention a specific person. For example, if you're trying to find out who this character "Donald Rumsfeld" you keep hearing about on the news is, you'd type it into NameBase and find about 50 books and articles that mention the man.

    No, I'd type it into Google, where I'd find about 130,000 pages that mention him, and the top hit is

    Biography of Donald H. Rumsfeld ...
    THE HONORABLE DONALD RUMSFELD. Secretary of Defense. Photo: Secretary
    Donald H. Rumsfeld. Until being sworn in as the 21st Secretary ...
    www.defenselink.mil/bios/rumsfeld.html - 11k - Cached - Similar pages

    Well, what do you know, he's Secretary of Defense.
  • by Gallowglass ( 22346 ) on Thursday August 29, 2002 @11:33AM (#4164205)
    From the article:

    When you type "NameBase" into Google, Brandt's site comes up first, but Brandt is not satisfied with that. "My problem has been to get Google to go deep enough into my site," he says. In other words, Brandt wants Google to index the 100,000 names he has in his database, so that a Google search for "Donald Rumsfeld" will bring up NameBase's page for the secretary of defense.

    So Mr. Brandt wants his internal pages to be the most important page about the person rather than, say, the person's own home page. Personally, I take leave to doubt that Mr. Brandt's opinion of a person is the ultimate reference in most cases (if any).

    To me, it's the old, common story of, "You're not doing what I want. You're mean and unfair!" One expects this in the kindergarten sandbox, but not in adults. Alas, the latter expectation is all too often unmet.

    World is full of whineboxes.
  • I do not use Google to "browse" for information or find the one site that has it all and looks pretty. Normally I know exactly what I am looking for, or at least something very specific. I just need to find it. I start near the top of the Google results and work my way down until I get my answer or enough information to solve my quest. If I can't find it, I try different keywords. A recent search I had was an example of a fetchmailrc using preauth. Sure, there may be a few top notch fetchmail sites (and thousands of copies of the man page) out there but I'd be wasting my time viewing them if they did not have the specific example I am looking for. If I want general information on a subject, I make my searches simpler or use the Google categories. If this guy's page truly is as good as he believes, his creation will eventually make its way up the ladder.
  • ... because my page is unpopular.
  • Search: Netherlands history
    Result: Teen Catholic barely-legal sluts from Holland!

    Search: Mali Timbuktu empire
    Result: Malian Cum-Slurping Sluts! Timbuktu Kama Sutra Style Mature Singles Waiting For You!

    Thank goodness Google is here, even if it's not 100% perfect.
  • by ivan256 ( 17499 ) on Thursday August 29, 2002 @11:53AM (#4164369)
    in Google the sites that do well are the spammy sites, sites which have Google psyched out, and a lot of big sites, corporate headquarters' sites -- they show up before sites that criticize those companies."

    In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."


    He wants google to be a political action site that favors his views. He's a whiny little baby.

    Sites that criticize corporations should appear before the corporation's main site? Why? Did you search for the company or for criticism? If the company/group in question was something he agreed with, perhaps some environmental organization or the democratic national committee, would he want criticism of them to come up first too?

    A quick stop at google shows that if you search for "United Airlines" you get their site first, and the site he thinks should be first shortly thereafter. If you search for "United Airlines criticism" you get the site he recommends first. Looks like google is doing its job correctly to me.

    Why is salon publishing this crap?
  • by Remus Shepherd ( 32833 ) <remus@panix.com> on Thursday August 29, 2002 @11:58AM (#4164402) Homepage
    Soon to be announced: Google for Wackos! With a clean-cut, cookie-less interface free of CIA influence, Google for Wackos will return search results based not on the listed sites' popularity, but on the wackiness of the conspiracy theories they present. Most popular search terms include Zapruder, tin foil, UFOs, and of course sex (but only the dirty illegal kind that politicians have.)
  • Cache of site (Score:3, Informative)

    by perlyking ( 198166 ) on Thursday August 29, 2002 @11:58AM (#4164403) Homepage
    You can find a cache of the site at the wayback machine [archive.org].
    I know the guy will be gutted about that :)
  • by Rupert ( 28001 ) on Thursday August 29, 2002 @12:03PM (#4164441) Homepage Journal
    The answer is in the article. Six years ago, everyone used Yahoo. Then Yahoo went all portal on us, so the smart geeks started using Altavista. Then Altavista started selling #1 listings, so we all decamped to Google. Now everyone uses Google.

    Brandt's complaint appears to be that he has a database of citations, but when you search for Donald Rumsfeld his site is more than 10 pages down, where nobody ever looks. And that's fine with me. That's what I expect from Google. He obviously expects something else (like untied.com appearing higher than United Airlines' real site), and being the kind of person he apparently is, he expects Google to change to become how he expects them to be, rather than realigning his expectations with reality.
  • by Pedrito ( 94783 ) on Thursday August 29, 2002 @01:08PM (#4164940)
    I'm sorry, I missed the "The most democratic search engine in the world" quote on the Google web site. Can someone post that link for me?

    Google's page rank isn't democratic, and thank God for that. Otherwise I'd have to wade through a bunch of crap that I generally don't want to wade through.

    Different search engines are better at searching for different things, but Google is my first choice almost every time. It is, by far, the most effective search engine I've seen. If it wasn't, I don't think it would be the most popular.

    Someone explain to me why anyone pays attention to this guy.
  • Here's my essay (Score:4, Interesting)

    by Everyman ( 197621 ) on Thursday August 29, 2002 @01:12PM (#4164977) Homepage
    Hi all. I'm the evil Daniel Brandt who has the gall to criticize your beloved Google. Sorry the site is down. We're being synflooded, apparently by one or more slashdotters, since it started with the slashdot post. It's probably one of those who posted here, saying that if we can't keep our site going, then we don't belong in Google. We have our own router, so we hope to be able to clear things up shortly.

    A few points missed in the Salon piece:

    I specifically pointed out to the author of the piece when he interviewed me, that I felt my site did okay in Google, and that I was speaking for the public interest. The so-called "royal we" that Mr. Manjoo, the interviewer and author, refers to sarcastically, is used because I'm speaking for a tax-exempt, nonprofit public charity, Public Information Research, Inc. We do not sell widgets. Some of the comments in Slashdot have me mixed up with another person who is selling ads based on PageRank. But then, who expects Slashdotters to actually read the article?

    My main site in Google is www.pir.org and it has a PageRank of 7. The www.namebase.org, with a PR of 6, is a streamlined CGI version of the main site, without all the essays and cartoons. NameBase began in the early 1980s and has been on the Internet since early 1995.

    The other problem I have with the author's spin is that a good half of the interview was about Google's cookie. Most of the work I put into www.google-watch.org has to do with the cookie. In the article, the cookie is briefly mentioned, and most of the article is about how selfish and silly I am to think that Google should rank me higher.

    My complaint about Google is not that PIR got the short end of the stick from Google, but that Google's stick should be longer.

    My essay about PageRank is below.

    _____________________

    PageRank: Google's Original Sin

    by Daniel Brandt

    By 1998, the dot-com gold rush was in full swing. Web search engines had been around since 1995, and had been immediately touted by high-tech pundits (and Forbes magazine) as one more element in the magical mix that would make us all rich. Such innovations meant nothing less than the end of the business cycle.

    But the truth of the matter, as these same pundits conceded after the crash, was that the false promise of easy riches put bottom-line pressures on companies that should have known better. One of the most successful of the earliest search engines was AltaVista, then owned by Digital Equipment Corporation. By 1998 it began to lose its way. All the pundits were talking "portals," so AltaVista tried to become a portal, and forgot to work on improving their search ranking algorithms.

    Even by 1998, it was clear that too many results were being returned by the average search engine for the one or two keywords that were entered by the searcher. AltaVista offered numerous ways to zero in on specific combinations of keywords, but paid much less attention to the "ranking" problem. Ranking, or the ordering of returned results according to some criteria, was where the action should have been. Users don't want to figure out Boolean logic, and they will not be looking at more than the first twenty matches out of the thousands that might be produced by a search engine. What really matters is how useful the first page of results appears on search engine A, as opposed to the results produced by the same terms entered into engine B. AltaVista was too busy trying to be a portal to notice that this was important.

    Enter Google

    By early 1998, Stanford University grad students Larry Page and Sergey Brin had been playing around with a particular ranking algorithm. They presented a paper titled "The Anatomy of a Large-Scale Hypertextual Web Search Engine" at a World Wide Web conference. With Stanford as the assignee and Larry Page as the inventor, a patent was filed on January 9, 1998. By the time it was finally granted on September 4, 2001 (Patent No. 6,285,999), the algorithm was known as "PageRank," and Google was handling 150 million search queries per day. AltaVista continued to fade; even two changes of ownership didn't make a difference.

    Google hyped PageRank, because it was a convenient buzzword that satisfied those who wondered why Google's engine did, in fact, provide better results. Even today, Google is proud of their advantage. The hype approaches the point where bloggers sometimes have to specify what they mean by "PR" -- do they mean PageRank, the algorithm, or do they mean the Public Relations that Google does so well:

    PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
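    The weighted-vote idea in that passage can be sketched as a small power iteration. This follows the textbook form from the Page and Brin paper with a damping factor of 0.85; it is an illustration only, not Google's production code, and the page names in the example graph are invented:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores for a small link graph.

    `links` maps each page to the list of pages it links to.
    Scores sum to 1 across all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# "a" and "b" each receive exactly one link, but "a" is linked from "hub",
# which itself has incoming links, while "b" is linked from "x", which has none.
example_links = {
    "y": ["hub"],
    "z": ["hub"],
    "hub": ["a"],
    "x": ["b"],
    "a": [],
    "b": [],
}
scores = pagerank(example_links)
```

    In this toy graph "a" ends up scoring higher than "b" even though each has a single incoming link, because the page voting for "a" carries more weight of its own.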

    Google goes on to admit that other variables are also used, in addition to PageRank, in determining the relevance of a page. While the broad outlines of these additional variables are easily discerned by webmasters who study how to improve the ranking of their websites, the actual details of all algorithms are considered trade secrets by Google, Inc. It's in Google's interest to make it as difficult as possible for webmasters to cheat on their rankings.

    It's all in the ranking

    Beyond any doubt, search engines have become increasingly important on the web. E-commerce is very attuned to the ranking issue, because higher ranking translates directly into more sales. Various methods have been designed by various engines to monetize the ranking situation, such as paid placement, pay per click, and pay for inclusion. On June 27, 2002, the U.S. Federal Trade Commission issued guidelines that recommended that any ranking results influenced by payment, rather than by impartial and objective relevance criteria, ought to be clearly labeled as such in the interests of consumer protection. It appears, then, that any algorithm such as PageRank, that can reasonably pretend to be objective, will remain an important aspect of web searching for the foreseeable future.

    Not only have engines improved their ranking methods, but the web has grown so huge that most surfers use search engines several times a day. All portals have built-in search functions, and most of them have to rely on one of a handful of established search engines to provide results. That's because only a few engines have the capacity to "crawl" or "spider" more than two billion web pages frequently enough to keep their database current. Google is perhaps the only engine that is known for consistent, predictable crawling, and that's only been true for less than two years. It takes almost a week to cover the available web, and another week to calculate PageRank for every page. Google's main update cycle is about 28 days, which is a bit too slow for news-hungry surfers. In August, 2001 they also began a second "mini-crawl" for news sites, which are now checked every day. Results from each crawl are mingled together, giving the searcher an impression of freshness.

    For the average webmaster, the mechanics of running a successful site have changed dramatically from 1996 to 2002. This is due almost entirely to the increased importance of search engines. Even though much of the dot-com hype collapsed in 2000 and 2001 (a welcome relief to noncommercial webmasters who remembered the pre-hype days), the fact remains that by now, search engines are the fundamental consideration for almost every aspect of web design and linking. It's close to a wag-the-dog situation. That's why the algorithms that search engines consider to be consistent with the FTC's idea of impartial and objective ranking criteria deserve closer scrutiny.

    What objective criteria are available?

    Ranking criteria fall into three broad categories. The first is link popularity, which is used by a number of search engines to some extent. Google's PageRank is the original form of "link pop," and remains its purest expression. The next category is on-page characteristics. These include font size, title, headings, anchor text, word frequency, word proximity, file name, directory name, and domain name. The last is content analysis. This generally takes the form of on-the-fly clustering of produced results into two or more categories, which allows the searcher to "drill down" into the data in a more specific manner. Each method has its place. Search engines use some combination of the first two, or they use on-page characteristics alone, or perhaps even all three methods.

    Content analysis is very difficult, but also very enticing. When it works, it allows for the sort of graphical visualization of results that can give a search engine an overnight reputation for innovation and excellence. But many times it doesn't work well, because computers are not very good at natural language processing. They cannot understand the nuances within a large stack of prose from disparate sources. Also, most top engines work with dozens of languages, which makes content analysis more difficult, since each language has its own nuances. There are several search engines that have made interesting advances in content analysis and even visualization, but Google is not one of them. The most promising aspect of content analysis is that it can be used in conjunction with link pop, to rank sites within their own areas of specialization. This provides an extra dimension that addresses some of the problems of pure link popularity.

    Link popularity, which is "PageRank" to Google, is by far the most significant portion of Google's ranking cocktail. While in some cases the on-page characteristics of one page can trump the superior PageRank of a competing page, it's much more common for a low PageRank to completely bury a page that has perfect on-page relevance by every conceivable measure. To put it another way, it's frequently the case that a page with both search terms in the title, and in a heading, and in numerous internal anchors, will get buried in the rankings because the sponsoring site isn't sufficiently popular, and is unable to pass sufficient PageRank to this otherwise perfectly relevant page. In December 2000, Google came out with a downloadable toolbar attachment that made it possible to see the relative PageRank of any page on the web. Even the dumbed-down resolution of this toolbar, in conjunction with studying the ranking of a page against its competition, allows for considerable insight into the role of PageRank.

    Moreover, PageRank drives Google's monthly crawl, such that sites with higher PageRank get crawled earlier, faster, and deeper than sites with low PageRank. For a large site with an average-to-low PageRank, this is a major obstacle. If your pages don't get crawled, they won't get indexed. If they don't get indexed in Google, people won't know about them. If people don't know about them, then there's no point in maintaining a website. Google starts over again on every site for every 28-day cycle, so the missing pages stand an excellent chance of getting missed on the next cycle also. In short, PageRank is the soul and essence of Google, on both the all-important crawl and the all-important rankings. By 2002 Google was universally recognized as the world's most popular search engine.
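    As a rough illustration of a score-ordered crawl with a limited budget, consider the sketch below. The scores, URLs, and the idea of a fixed page budget are assumptions made for the example; Google's actual crawl scheduling is not public:

```python
import heapq


def crawl_order(page_scores, budget):
    """Return up to `budget` URLs, highest score first.

    `page_scores` maps URL -> score. Pages that do not fit in the
    budget are simply never fetched in this cycle.
    """
    # heapq is a min-heap, so negate scores to pop the largest first.
    heap = [(-score, url) for url, score in page_scores.items()]
    heapq.heapify(heap)
    order = []
    while heap and len(order) < budget:
        _, url = heapq.heappop(heap)
        order.append(url)
    return order


example_scores = {
    "big.example/": 7,
    "big.example/deep": 5,
    "small.example/": 4,
    "small.example/deep": 1,
}
fetched = crawl_order(example_scores, budget=3)
```

    With a budget of three, the deep page on the low-scoring site is the one left out, and if the queue is rebuilt from scratch each cycle it is left out again next time.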

    How does PageRank measure up?

    In the first place, Google's claim that "PageRank relies on the uniquely democratic nature of the web" must be seen for what it is, which is pure hype. In a democracy, every person has one vote. In PageRank, rich people get more votes than poor people, or, in web terms, pages with higher PageRank have their votes weighted more than the votes from lower pages. As Google explains, "Votes cast by pages that are themselves 'important' weigh more heavily and help to make other pages 'important.'" In other words, the rich get richer, and the poor hardly count at all. This is not "uniquely democratic," but rather it's uniquely tyrannical. It's corporate America's dream machine, a search engine where big business can crush the little guy. This alone makes PageRank more closely related to the "pay for placement" schemes frowned on by the Federal Trade Commission, than it is related to those "impartial and objective ranking criteria" that the FTC exempts from labeling.

    Secondly, only big guys can have big databases. If your site has an average PageRank, don't even bother making your database available to Google's crawlers, because they most likely won't crawl all of it. This is important for any site that has more than a few thousand pages, and a home page PageRank of about five or less on the toolbar's crude scale.

    Thirdly, in order for Google to access the links to crawl a deep site of thousands of pages, a hierarchical system of doorway pages is needed so that the crawler can start at the top and work its way down. A single site with thousands of pages typically has all external links coming into the home page, and few or none coming into deep pages. The home page PageRank therefore gets distributed to the deep pages by virtue of the hierarchical internal linking structure. But by the time the crawler gets to the real "meat" at the bottom of the tree, these pages frequently end up with a PageRank of zero. This zero is devastating for the ranking of that page, even assuming that Google's crawler gets to it, and it ends up in the index, and it has excellent on-page characteristics. The bottom line is that only big, popular sites can put their databases on the web and expect Google to cover their data adequately. And that's true even for websites that had their data on the web long before Google started up in 1999.
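    A back-of-the-envelope sketch of that dilution, under simplifying assumptions: the site is a pure tree, every page links to the same number of child pages, all external links hit the home page, and each level passes on its score multiplied by a damping factor of 0.85 (the value from the published PageRank paper). The small per-page baseline is ignored, and the function name is made up:

```python
def score_by_depth(home_score, fanout, depth, damping=0.85):
    """Estimate the score of one page at each level of a tree-shaped site.

    Each page passes `damping` times its score, split evenly across
    `fanout` children. Returns a list from the home page downward.
    """
    scores = [home_score]
    for _ in range(depth):
        scores.append(damping * scores[-1] / fanout)
    return scores


# A home page with score 1.0, ten links per page, three levels down.
levels = score_by_depth(1.0, fanout=10, depth=3)
```

    With ten links per page, each level keeps less than a tenth of the level above it, so pages three levels down hold well under a thousandth of the home page's score.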

    What about non-database sites?

    There are other areas where PageRank has a negative effect, even for sites without a lot of data. The nature of PageRank is so discriminatory, that it's rather like the exact opposite of affirmative action. While many see affirmative action as reverse discrimination, no one would claim (apart from economists who advocate more tax cuts for the rich) that the opposite, which would be deliberate discrimination in favor of the already-privileged, is a solution for anything. Yet this is essentially what Google claims.

    Those who launch new websites in 2002 have a much more difficult time getting traffic to their sites than they did before Google became dominant. The first step for a new site is to get listed in the Open Directory Project. This is used by Google to seed the crawl every month. But even after a year of trying to coax links to your new site from other established sites, the new webmaster can expect fewer than 30 visitors per day. Sites with a respectable PageRank, on the other hand, get tens of thousands of visitors per day. That's the scale of things on the web -- a scale that is best expressed by the fact that Google's zero-to-ten toolbar is a logarithmic scale, perhaps with a base of six. To go from an old PageRank of four to a new rank of five requires several times more incoming links. This is not easy to achieve. The cure for cancer might already be on the web somewhere, but if it's on a new site, you won't find it.
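    The logarithmic-scale claim is easy to put into numbers. The sketch below takes the essay's own guess of base six at face value; the actual mapping behind the toolbar is not given in this thread, so both the base and the function name are assumptions.

```python
def raw_score_ratio(old_toolbar, new_toolbar, base=6.0):
    """How many times larger the underlying score must be to move between
    two toolbar values, assuming the toolbar shows log_base(raw score)."""
    return base ** (new_toolbar - old_toolbar)


print(raw_score_ratio(4, 5))   # 6.0: one toolbar step is a factor of six
print(raw_score_ratio(0, 10))  # full range of the scale under this guess
```

    With a base of six, a single step from four to five means roughly six times the underlying score, which matches the "several times more incoming links" in the paragraph above.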

    PageRank also encourages webmasters to change their linking patterns. On search engine optimization forums, webmasters even discuss charging for little ads with links, according to the PageRank they've achieved for their site. This would benefit those sites with a lower PageRank that pay for such ads. Sometimes these PageRank achievements are the result of link farms or other shady practices, which Google tries to detect and then penalizes with a PageRank of zero. At other times professional optimizers get away with spammy techniques. Mirror sites and duplicate pages on other domains are now forbidden by Google and swiftly punished, even when there are good reasons for maintaining such sites. Overall, linking patterns have changed significantly because of Google. Many webmasters are stingy about giving out links (which can dilute your transference of PageRank to a given site), at the same time that they're desperate for more links from others.

    What should Google do?

    We feel that PageRank has run its course. Google doesn't have to abandon it entirely, but they should de-emphasize it. The first step is to stop reporting PageRank on the toolbar. This would mute the awareness of PageRank among optimizers and webmasters, and remove some of the bizarre effects that such awareness has engendered. The next step would be to replace all mention of PageRank in their own public relations documentation, in favor of general phrases about how link popularity is one factor among many in their ranking algorithms. And Google should adjust the balance between their various algorithms so that excellent on-page characteristics are not completely cancelled by low link popularity.

    PageRank must be streamlined so that the "tyranny of the rich" characteristics are scaled down in favor of a more egalitarian approach to link popularity. This would greatly simplify the complex and recursive calculations that are now required to rank two billion web pages, which must be very expensive for Google. The crawl must not be PageRank driven. There should be a way for Google to arrange the crawl so that if a site cannot be fully covered in one cycle, Google's crawlers can pick up where they left off on the next cycle.

    Google is so important to the web these days, that it probably ought to be a public utility. Regulatory interest from agencies such as the FTC is entirely appropriate, but we feel that the FTC addressed only the most blatant abuses among search engines. Google, which only recently began using sponsored links and ad boxes, was not even an object of concern to the Ralph Nader group, Commercial Alert, that complained to the FTC.

    This was a mistake, because Commercial Alert failed to look closely enough at PageRank. Some aspects of PageRank, as presently implemented by Google, are nearly as pernicious as pay for placement. There is no question that the FTC should regulate advertising agencies that parade as search engines, in the interests of protecting consumers. Google is still a search engine, but not by much. They can remain a search engine only by fixing PageRank's worst features.

    *

    [Daniel Brandt is founder and president of Public Information Research, Inc., a tax-exempt public charity that sponsors NameBase. He began compiling NameBase in 1982, from material that he started collecting in 1974, and is now the programmer and webmaster for PIR's several sites. He participates in various forums where webmasters share observations about the often-secretive algorithms, bugs, and behavior of various search engines. Brandt has been watching Google's interaction with NameBase ever since Google, in October, 2000, became the first search engine to go "deep" on PIR's main site by crawling thousands of dynamic pages.]

    • by schon ( 31600 ) on Thursday August 29, 2002 @04:42PM (#4166568)
      I don't get it - you go on at great length about what google should do; about how bad Pagerank is, and how it should be fixed. But you don't say why you're not doing it yourself.

      Google became what it is because it saw an unfilled niche, and filled it. They "built a better mousetrap", and the world did indeed beat a path to their door. There is nothing stopping you from doing the same. If you're half as smart as you seem to think you are, you should have no problem implementing a search engine, and becoming as successful as Google is now.

      Google is NOT a public utility, nor is it any form of monopoly. It needs to be regulated just as much as YOUR site does.

      Unlike so many other companies, Google got where it is today solely on the merits of its technology. It didn't succeed by pumping millions of dollars into marketing, it didn't succeed by using underhanded business tactics to squash its competitors. All it did was make the best product.

      Contrary to your essay, I (and I think many /.'ers) think that Pagerank works, and works very well. If you believe otherwise, why don't you simply go ahead and prove it?
  • NameBase sucks (Score:5, Insightful)

    by Animats ( 122034 ) on Thursday August 29, 2002 @02:37PM (#4165637) Homepage
    What a whiner. Have you looked at NameBase?
    • It's a search engine. You find info by typing names into a form. There are no obvious links to the content. How's that supposed to get spidered?
    • His search engine is overloaded right now and just returns error messages. Maybe that's what Google sees.
    • The good data is by subscription only: "And ask your library or student government to subscribe to NameBase ($200 for two years of unrestricted access from any campus computer) so that we can continue to add names, and you can continue to find them."
    • <meta NAME="GOOGLEBOT" CONTENT="NOARCHIVE"> can't be helping.
    • This guy is very picky about who gets to spider him. Here's his "robots.txt" file:
      User-agent: ia_archiver
      Disallow: /

      User-agent: scooter
      Disallow: /

      User-agent: mercator
      Disallow: /

      User-agent: psbot
      Disallow: /

      User-agent: SlySearch
      Disallow: /

      User-agent: *
      Disallow: /cgi-bin/
      Disallow: /zipdir/

    • He uses one-pixel GIFs to trap spiders. [namebase.org] He also uses cookies and web bugs, providing a long-winded explanation of why what he does is OK, but what Google does is evil.
    In conclusion, this guy created his own problem.

    I run three web sites. Each is at the top of the Google rankings for its obvious keywords, and I've done nothing whatsoever to make that happen. I just have useful content that people like.
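    The robots.txt quoted above can be checked mechanically with Python's standard-library parser. This is a small sketch: the rules are copied from the comment, while the host name is a placeholder and the test paths are hypothetical, since the thread does not say where the site's pages actually live.

```python
from urllib.robotparser import RobotFileParser

# Rules exactly as quoted in the comment above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: scooter
Disallow: /

User-agent: mercator
Disallow: /

User-agent: psbot
Disallow: /

User-agent: SlySearch
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /zipdir/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://example.org"  # placeholder host, not the real site


def allowed(agent, path):
    """Return True if the quoted rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, BASE + path)


# Hypothetical paths, used only to exercise the rules.
for agent in ("Googlebot", "ia_archiver", "scooter"):
    for path in ("/index.html", "/cgi-bin/search"):
        print(f"{agent:12s} {path:18s} {allowed(agent, path)}")
```

    Under these rules Googlebot falls through to the wildcard entry, so it may fetch ordinary pages but nothing under /cgi-bin/ or /zipdir/, while the five named crawlers are shut out entirely.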

