Google's Site Ranking Secrets 309

vivin writes "Ever wonder how Google's site ranking works? Wonder no more. Google recently filed United States Patent Application 20050071741 on March 31, 2005. This patent reveals a great deal of information about Google's site ranking algorithm and makes very good reading. For example, one of the criteria that they use is the number of years that your site has been registered. If your site has been registered for less than a year, then it counts against you. A site registered for a longer period of time means that the owner is probably serious about the site, and the site is probably legitimate. Google's Site Ranking algorithms reveal how hard they are making it for spam sites to get listed (on Google). This information will also make it easier for you to make sure that you get listed well in Google."
This discussion has been archived. No new comments can be posted.
  • Note (Score:5, Insightful)

    by Leffe ( 686621 ) on Thursday June 16, 2005 @07:07AM (#12831191)
    Note that there is no guarantee that Google uses everything in the patent or that they don't use other methods not described in any of their other patents.
  • or (Score:3, Insightful)

    by lostpixel. ( 788696 ) on Thursday June 16, 2005 @07:07AM (#12831192) Homepage
    or conversely, how spam websites can rank higher :)
    • Never mind spammers, what about all the folks in Europe and then the East Coast of the US who now have a higher rating than me here on the West Coast! This story was posted at 5:04AM, when I was fast asleep in bed and unable to act on this important news!
  • by henrywood ( 879946 ) on Thursday June 16, 2005 @07:08AM (#12831194)
    I prefer the official Google explanation:

    http://www.google.com/technology/pigeonrank.html [google.com]
    • I'm currently in negotiations with Google to see if I can use their massive amounts of Pigeon Clusters and a few statues I have handy to do some studies on the dynamics of white dielectric material.

      I'll probably just end up bullshitting the answers instead.

  • by dhasenan ( 758719 ) on Thursday June 16, 2005 @07:08AM (#12831196)
    'Google record the discovery of a link and link changes over time. The speed at which a site gains links and the link life span.' I fail to see how this would be helpful--if something's new and briefly popular, you only want to give it a high rank for a brief period and forget it once people stop linking. But if something's new and popular for a duration, you want to keep it well ranked.
    • by Peeteriz ( 821290 ) on Thursday June 16, 2005 @07:27AM (#12831289)
      If you have a good site with valuable information, then, over time, news of it will get around, and you will keep getting new links over time.

      However, if you got 1000 links at once, and for the next few months no one else links to you, then you probably bought the initial links, and nobody real considers the content worthy of attention.
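The burst-versus-steady distinction described above can be sketched in a few lines. This is only an illustration of the idea, not anything taken from the patent: the function name, the 7-day window, the 90% share and the minimum link count are all assumptions chosen for the example.

```python
from datetime import date, timedelta

def looks_like_bought_links(link_dates, burst_days=7, burst_share=0.9, min_links=50):
    """Return True when nearly all inbound links were first seen in one short window.

    link_dates: dates on which each inbound link was first discovered.
    All thresholds are hypothetical values picked for illustration.
    """
    if len(link_dates) < min_links:
        return False  # too little data to call it a burst
    ordered = sorted(link_dates)
    window = timedelta(days=burst_days)
    best = 0   # largest number of links seen inside any single window
    start = 0
    for end in range(len(ordered)):
        # Slide the left edge forward until the window fits again.
        while ordered[end] - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best / len(ordered) >= burst_share
```

A site that picks up one link a week for two years stays under the threshold, while a site whose links all appear on the same day is flagged.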
  • by Dancin_Santa ( 265275 ) <DancinSanta@gmail.com> on Thursday June 16, 2005 @07:08AM (#12831197) Journal
    Could someone explain how other crap search engines are getting high rankings in Google search?

    Sometimes when I search for something specific, I get a bunch of useless links that have results of other "search engines" that invariably show something similar to "0 results for your search terms 'sheep+barn+slashbot+erotica'"

    How do these sites get on the first page of Google results?
    • With the money involved, they will find a way. Basically, a cottage industry exists that is devoted to figuring out how to manipulate search engine results.
    • by jrumney ( 197329 ) on Thursday June 16, 2005 @07:20AM (#12831256)
      Yes, they're really annoying. They often have genuine-looking summaries in Google's results, enticing you to click on them expecting to find useful information, but all you find is a page of links, often completely unrelated to your Google search. I wonder why Google hasn't got on top of them yet. All it would take is a second robot identifying itself as Internet Explorer slowly crawling the web looking for pages that give completely different results than the Google spider.
      • I know, it seems so simple that Google just sends around another spider to reference their pages.

        Or, instead, just change the browser tags of GoogleBot to random but human-identifiable names to prevent sites from displaying only to GoogleBot.

        Instead of GoogleBot 2.0, it is "Google - Not Firefox - Bot 2.0" or "Internet Explorer via GBot v2". It would at least deter the sites that try to cheat google, and it would still identify to webmasters GoogleBot's traffic.

        But with all the spam sites, I don't see why
        • Nah, it'd still be too easy to just regexp for UserAgents that match /google/ and /bot/.

          It might be against the Robots Exclusion Standards [robotstxt.org] to deliberately fake your UserAgent header, but that's mostly so you can contact the robot's owner if it goes wrong and accidentally DOSes your site.

          I doubt severely anyone would mind if Google did an occasional, low-impact, slow, back-up crawl disguised as IE (presumably also from an IP address block not known to belong to Google), especially since GoogleBot has only
        • I am not convinced Googlebot works. I have a website with a robot.txt begging for the Googlespiderbot to come by, and it still hasn't.

          The page is still not available on Google, and I don't even care about ranking high. Every other search engine has found the site except Google.

      • by stripmarkup ( 629598 ) on Thursday June 16, 2005 @07:51AM (#12831429) Homepage
        This type of spam (showing a page to the crawler and another to the user) is called cloaking. Cloakers have anticipated this sort of move and can detect a search engine's crawler by not just the user agent but also the IP address range it comes from and other heuristics. In order to beat them, search engines would have to crawl from unpredictable IP addresses and behave like regular users.

        A while back I proposed a distributed approach like this in the Nutch mailing list [mail-archive.com]. The problem is that it would be hard to implement and it may not be worth the effort, since there are cheaper ways to fight spam.
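A rough sketch of the comparison step behind cloaking detection: given the body a site served to the declared crawler and the body it served to an ordinary browser, measure how much the two overlap. Fetching the pages (and doing so from an unpredictable IP address) is left out entirely; the function names and the 0.5 threshold are assumptions for illustration, and word-set overlap is just one simple similarity measure among many.

```python
import re

def word_set(html):
    """Lower-cased words of a page body, with tags crudely stripped."""
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def looks_cloaked(crawler_body, browser_body, min_similarity=0.5):
    """Return True when the two versions of a page share too few words.

    Uses Jaccard similarity (shared words / all words); the threshold
    is a hypothetical value, not a known search-engine setting.
    """
    a = word_set(crawler_body)
    b = word_set(browser_body)
    if not a and not b:
        return False  # two empty pages are identical, not cloaked
    similarity = len(a & b) / len(a | b)
    return similarity < min_similarity
```

Dynamic pages legitimately differ a little between requests, which is why the check uses a threshold rather than an exact match.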
    • by shird ( 566377 ) on Thursday June 16, 2005 @07:29AM (#12831307) Homepage Journal
      They are scraper sites.

      They get to the top through link spamming, 302 hijacks, "scraping" content from other sites, search engine optimisation etc etc etc.

      They are sites "made for adsense" as it's called, whereby they exist for the sole purpose of being highly ranked in Google and getting ad clicks from people looking for something else. Effectively 'doorway' pages, which make a shitload of money, as people that land on such pages don't find what they really want, so click through on the ads in hopes of finding it there instead.

      The crap of the internet, many hundreds of thousands of such sites run by only a handful of thousand very rich people.
    • by Pig Hogger ( 10379 ) <.moc.liamg. .ta. .reggoh.gip.> on Thursday June 16, 2005 @07:29AM (#12831308) Journal
      Could someone explain how other crap search engines are getting high rankings in Google search?
      That's not as bad as getting mailing list archives. You search for a particular Linux error message, and what you get is archives of mailing list messages from guys who ask precisely the same question, with a lot of "me too" follow-ups, but no definite answer to your problem, that is if you can manage to find the link that leads to the follow-ups... Because heaven forbid mailing list archive software offer standardized navigation...
      • by Anonymous Coward
        But mailing list archives are valid results in that case, in fact I like them because they confirm it's a (more or less) common problem and often there will be a helpful reply. And yeah, mailing list archive navigation sucks bigtime...
      • by Leffe ( 686621 ) on Thursday June 16, 2005 @07:56AM (#12831469)
        Try figuring this [ruby-talk.org] wonderful navigation out and I'll give you a cookie!

        Hey, there's a help button... *clicks*... Oh God...
      • That's not as bad as getting mailing list archives.

        Mailing list archives are almost always exactly what I was looking for. They are pure information, no marketing fluff like the official pages for a product often are. The problem is that some mailing lists have a gazillion mirrors, so if the first hit isn't exactly what you were looking for, you have to flick through 10 pages of exactly the same result before you get to something else that might solve your problem. A voluntary standard for mirrors of mail

    • How do these sites get on the first page of Google results?

      1. Do-it-yourself search site (doesn't matter if it uses google, just put it there)
      2. Store in a database the most used search terms
      3. Produce a list of links with query strings that contain the search terms
      4. Wait for google to index the page and crawl it
      5. ???
      6. Profit!!
  • They've thrown every technique they could have thought of into the patent purely as a defensive mechanism to prevent other major engines from patenting them. Some of the techniques are thrown in as defensive FUD to prevent newbies from using them.

    Some of these techniques are just plain old bizarre and might be way too difficult to approach algorithmically.

    Oh well .. what do I know ..
    • Some of these techniques are just plain old bizarre and might be way too difficult to approach algorithmically.

      In order for a patent claim to be valid, a prototype must exist. Perhaps the most important word here is claim. There are two main parts of a patent: a description which is provided for purposes of explanation, and the claims which are the really important part. It's not uncommon for several patents to use the same description; last I knew, my previous employer was pursuing four patents on

  • Step 1: Build time machine
    Step 2: Go 5 years into past, buy domain names, set up sites with lots of soft porn images
    Step 3: Return to present, stopping off each year on the way to renew domains.
    Step 4: Sell to spammers etc.
    Step 5: Profit.

    I'm open to venture capitalists for investment in this one.

    • Step 1: Build Time Machine
      Step 2: Go back to 1994 and register Google.com (and Google.net and Google.org) before Google does.
      Step 3: Offer to sell the domains to them for 0.25% ownership of the company, and 0.5% of the stock to be issued in any "hypothetical" "future" IPO [yahoo.com]; this should be small enough they'll cough up without hesitating.
      Step 4: Pop back to 1977 and pick up 100 shares of Berkshire Hathaway while you're about it.
      Step 5: Profit!

      Think Big. Win Small. --Darius Regulo, the King of Heaven

      • Probably all that would do is cause them to pick a different name for the search engine. Unless you registered every possible domain, they could just choose from any that you hadn't registered yet.

        Besides, it seems like a lot of work when you already have a time machine. An easier solution would be:

        1. Find an event you can bet on with a huge payout.
        2. Go back in time and bet as much as you can.
        3. Go to step 2 until you're sick of money.
  • SEOs make me barf (Score:5, Insightful)

    by Anonymous Coward on Thursday June 16, 2005 @07:12AM (#12831212)
    Argh... quit trying to game the system! If you read the article, it's entirely from the perspective of someone trying to corrupt the rankings for financial gain. Here's an idea: make good, useful web pages, rather than spending all your time and energy creating these BS link farms. The SEO world is the modern day equivalent of snake-oil salesmen.
    • Re:SEOs make me barf (Score:5, Interesting)

      by Momoru ( 837801 ) on Thursday June 16, 2005 @07:39AM (#12831373) Homepage Journal
      I agree they can be evil...but one thing Google lacks is giving new sites some priority....say i come out with the best tech site ever, but I have no money to advertise with, how do i get it popular? Ok i submit it to Google. I appear on page 5000 of the results. I have to beg people to link to my site, maybe spam a couple of blogs, i dunno...the thing is without the tricks, it's almost impossible to get your new site to appear in the search results. And even with them it's still pretty difficult. I think maybe Google should have a special section of "new to the web" or whatever to give these sites publicity. In the old days, the Yahoo directory kind of put all decent sites on even ground.
      • Are you serious? (Score:3, Informative)

        by KalvinB ( 205500 )
        I posted a binary for mod_proxy_html at my site along with a how to on compiling it and was listed on Google's front page (currently number 5) within a week. It was a small project that a major aerospace company needed. They actually found my page through Google before we notified them it was there by e-mail.

        Submitting the site to Google is a negative in their algorithm. Back when I had therabbithole.redback.inficad.com for my domain name Google found my site within a month.

        You can't be successful in a
      • I agree they can be evil...but one thing Google lacks is giving new sites some priority....say i come out with the best tech site ever, but I have no money to advertise with, how do i get it popular?

        Who says your site is the "best tech site ever"? What if I decide that *my* site is the "best tech site ever" and game the system to bump out yours?

        The site that has truly good content and that has been around longer should be ranked highest. Trying to manipulate your page rank using any other means is a lit
      • ...how do i get it popular?

        Which is precisely what pagerank, in essence, was designed to prevent. YOU can't make it popular, other people have to choose to do it by deciding your site is interesting and by linking it.

        As such, Google only reflects your popularity. If everyone could manipulate their results to "get it popular" on Google, then no one's site would be.

        As to getting yours popular, write articles, participate in blogs, get /.'ed.

        (Of course, if you have no money for advertising then you

    • SEO's are worse than snake oil salesmen. Snake oil salesmen have a product.

      They're worse than lawyers. With a lawyer you at least *know* you're going to get ripped off.

      They're more like chiropractors. The only reason you need one is that you don't want to do the work (building a good site/product or living a healthy lifestyle).
    • Not that simple (Score:5, Insightful)

      by Darkman, Walkin Dude ( 707389 ) on Thursday June 16, 2005 @08:01AM (#12831520) Homepage

      No, it's not that simple. Let's say I have a small business, I sell garden tools, lawnmowers, etc., in a certain region. And yet when I do a search on Google for garden tools + region, I am nowhere to be found. What do I do? I optimise the hell out of my site, caking it with region name + garden tools information, and I set up a links exchange program, getting in links left, right and centre from related sites. This is SEO, and it will only affect people that enter a search for "garden" "tools" "my region". In other words, those that actually want to find my site.

      There's a distinction between SEO and spamming; if I was to optimise for a garden tools site and set up a poker site there, that would be spamming.

      • Re:Not that simple (Score:4, Insightful)

        by Shaper_pmp ( 825142 ) on Thursday June 16, 2005 @09:23AM (#12832056)
        I hate SEO with a passion (and I have to do it as part of my job), but you're right.

        The way I see it, SEO is a tool - nothing more, nothing less. It isn't inherently evil or inherently good - it's how you use it and what you use it for that matters.

        If you've got a good site on... i dunno... aardvark polishing for fun and profit, then you should rank highly on Google. If you don't rank well on Google, it's probably because your site is lacking one of fame, content or clean code. All of these are necessary for (or inevitable side-products of) a good site that does what people want.

        Conversely, a good site will probably have many inbound links, clean semantic markup, well-focused pages full of good content and so on. This is simply good site design (or, like the links, a side-effect of it), but it's also the very ethical end of the SEO spectrum.

        Now, you also get evil scumbag fuckwits-for-hire who specialise in link-farming, keyword stuffing, cloaking and other black-hat techniques, and sell their services to shitty pr0n or spam sites. This is spam - no doubt about it - but it only represents the black-hat side of SEO.

        The black-hat SEOers, it must be admitted, are the ones who get all the attention. They're the ones advertising like mad, making overblown claims, spamming search engines with crap listings and generally getting in people's faces. However, just because these people use SEO doesn't make SEO bad. Before SEO they were likely sending e-mail spam until that got too hard, but you don't unilaterally brand professional-looking e-mails or people who sell mailing-list managers as evil, do you?

        As Google et al. get their acts in gear and revamp their algorithms, "SEO" is increasingly overlapping with "good site design" - this was always the intention, and even now "white-hat SEO" and "good site design" are pretty much synonymous.

        SEO isn't the problem - the problem is a combination of shithead black-hat SEOers, Search Engines inadequately assessing a page's worth and ill-educated types who shortsightedly blame the gun or bullet instead of the guy who fired it at them.
    • If you read the article, it's entirely from the perspective of someone trying to corrupt the rankings for financial gain.

      No, it's written from the perspective of helping people avoid inadvertently falling into the trap of being labeled a spammer. E.g. "If you are on a shared server it's possible somebody else on that server is using dirty tactics or Spaming. If so your site will suffer since you share the same IP."

      His conclusion on page 3: "Overall keep it ethical and you can't go wrong."

  • Woah, I'm a genius! [slashdot.org] ;-)

    "Am I correct in assuming that these sites pops up and down relatively often? Maybe it'd be possible to use temporal component to the rating. Say if the link points to a site which was just registered two days ago, it's given a very very low weight, and then you ramp up as time goes by."
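The ramp suggested in that quote could look something like the sketch below: a link from a freshly registered domain counts for almost nothing, and its weight grows linearly until the domain is a year old. The function name, the one-year ramp and the 0.01 floor are assumptions made up for this example, not values from the patent application.

```python
def link_weight(domain_age_days, ramp_days=365, floor=0.01):
    """Weight of a link as a function of the linking domain's age.

    Starts at `floor` for brand-new domains and rises linearly to 1.0
    once the domain is `ramp_days` old. All numbers are hypothetical.
    """
    if domain_age_days <= 0:
        return floor
    ramp = min(1.0, domain_age_days / ramp_days)
    return max(floor, ramp)
```

Under these assumptions a two-day-old throwaway domain contributes a weight of 0.01, while anything a year old or more contributes the full 1.0.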

  • by acostin ( 229653 ) on Thursday June 16, 2005 @07:15AM (#12831225) Homepage
    I always suspected this... When we started our business, we used the domain www.interakt.ro [interakt.ro] (we're from Romania). However, because we sell software tools mostly to the USA and Western Europe, we decided to move to www.interaktonline.com [interaktonline.com].

    Instantly, our ranking went from number one (for "Dreamweaver Php", for example, we were number one ahead of Macromedia itself for a long time) to page 10.
    Now, we're working hard to promote our site, we have links all over the place, but still our site hasn't climbed back to page 1 (search for "dreamweaver extensions" - we have to pay to get our site in the first position). I even thought that they do this on purpose for us to continue to pay for Google Ads :D

    Probably they say it too in the patent, but the best ranking tool is to use the right "title" tag in your pages. It's invaluable how well this scores as compared to the page content.

    Alexandru
  • by AndrewStephens ( 815287 ) on Thursday June 16, 2005 @07:16AM (#12831229) Homepage
    Nothing in the patent nullifies my pagerank defeating technique - put lots of links to my homepage [paradise.net.nz] in slashdot posts modded to +5 funny!
  • PageRank (Score:5, Informative)

    by Fermatprime ( 883412 ) on Thursday June 16, 2005 @07:16AM (#12831230)
    The article dedicates only a couple of paragraphs to PageRank, the main algorithm that Google uses, and about 2.5 pages to the rest. If anyone wants to know more about PageRank, here's Page and Brin's original paper: http://www-db.stanford.edu/~backrub/google.html [stanford.edu]
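For a feel of what the paper describes, here is a minimal power-iteration sketch of PageRank on a toy link graph. It uses the common normalized variant in which ranks sum to 1 and pages without outgoing links spread their rank evenly; the function name, the dict-of-lists graph format and the fixed iteration count are assumptions of this sketch, not details of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank over every page.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank
```

With `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up ranked highest because two pages link to it, and `b` lowest because nothing links to it.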
  • About the author (Score:5, Insightful)

    by nietsch ( 112711 ) on Thursday June 16, 2005 @07:21AM (#12831260) Homepage Journal

    About the Author
    How-to-make-money-online.info is a site focused on Making Money Online and Internet Marketing, listing the many and varied ways of making money online. Featuring, resources, thousands of Internet Marketing articles and useful links.

    This article comes with reprint rights. You are free to reprint and distribute it as you like. All that we ask is that you do not make any changes, that this resource text is include, and that the link above is intact.

    So that explains a lot. What a crappy article, I wonder if the submitter is the same as the Author?
  • by courtarro ( 786894 ) on Thursday June 16, 2005 @07:27AM (#12831288) Homepage
    Just to clarify, from the summary:

    one of the criteria that they use is the number of years that your site has been registered

    is not the same thing as (from the article):

    How many years did you register your domain name for?

    Though the summary suggests that older sites do better, the article is stating that, in order to improve one's Google ranking, domain owners should purchase longer domain registrations.

  • Impossible? Spyware? (Score:4, Interesting)

    by bogado ( 25959 ) <bogado&bogado,net> on Thursday June 16, 2005 @07:28AM (#12831303) Homepage Journal
    Some of the tactics detailed in the article require spyware (Google Toolbar?). It is not possible for Google to know when you came back to the search engine from your site, or another one (unless you have a link on your site to Google). It is also impossible for Google to know if you have a bookmark.

    Google does have a click-through engine attached to the results, but many people already find this abusive, in addition to the unique identifier cookie that Google pushes onto you.

    We all think Google is doing a good job, and it did manage to incorporate ads and an ad service that is well accepted by people. (I wonder why people still think it is a good idea to make blinking and noisy Flash ads?) The point is, how much do we trust Google? I personally don't mind the click-through very much, but I do not accept the cookie and will not install a toolbar.
    • If you enable pagerank querying in the google toolbar, Google knows where you go.
  • Microsoft OLE DB Provider for SQL Server error '80004005'

    [DBNETLIB][ConnectionOpen
    (PreLoginHandshake()).]General network error. Check your network documentation.

    E:\WEB\BUZZLE\EDITORIALS\../common.asp, line 156

    Buzzle? Okay if this guy is a fan of Dr Dre or something I'm going to eat my own socks...
  • I'll continue to wonder because I'm getting an internal "Page Not Found" message after some weird Microsoft DB messages...

    Seems ironic.
  • more on the subject (Score:5, Informative)

    by muszek ( 882567 ) on Thursday June 16, 2005 @07:34AM (#12831335) Homepage
    The story is so old I can't believe it made it to slashdot.

    Some more info on the subject:
    1. U.S. Patent Application [uspto.gov] - it's best to read what's exactly been patented.
    2. interesting discussion on webmasterworld [webmasterworld.com]

    Personally I think that while some of the stuff is interesting, most of it is made up to confuse SEOs (Google doesn't quite like them, you know that, right?). Before that, they had a couple of factors to think about and work on. Now, there's a shitload of stuff that just makes their work harder. Also, more factors influencing SERPs means it's much, much harder to do trial-and-error research on what works well and what doesn't.
  • Spammers (Score:3, Insightful)

    by th1ckasabr1ck ( 752151 ) on Thursday June 16, 2005 @07:36AM (#12831350)
    Google's Site Ranking algorithms reveal how hard they are making it for spam sites to get listed (on Google). This information will also make it easier for you to make sure that you get listed well in Google.

    Won't this information now make it easier for spam sites to get listed?

    • Not if it's done properly -- much like encryption algorithms, so long as there aren't any loopholes then being open about it is neutral at worst and could possibly have some benefits.
  • How can Google claim a patent infringement if other companies are keeping their algorithms as secret as Google did?

    Their pagerank algorithm was one of the keys to their success. Keeping it secret was one of the things that made Google work and it was a good secret - nobody completely knew how it worked. So why patent it? What's the point?

  • by Moiche ( 840352 ) * on Thursday June 16, 2005 @07:51AM (#12831434)
    Jeez, news for nerds, and the story was a badly edited blurb referencing a badly edited blog that didn't reference the patent application.

    Just look at the patent application yourself [uspto.gov].

    I haven't read the whole thing, but just having taken a quick look at it, I have to agree with the posters who said that Google purposefully tried to cover any conceivable technique to index and rank pages. The application discusses multiple implementations of the various techniques that could be used to rank a page. Therefore analysis of the patent application is probably of limited utility for those trying to game PageRank (which was certainly a factor that Google's very competent IP lawyers considered before prosecuting the patent).

    For those who are worried that Google is doing evil with this patent application, given the breadth of the patent and the fact that it discusses a plethora of techniques which Google may or may not be using, I will be surprised to see Google try to use this patent (or be able to use this patent) to push another search engine out of the market. More likely, I think, is that this will constitute prior art to enable Google to withstand challenges from other patent applicants for infringement. Of course, if you know anything about PageRank, you know that it was getting published in Scientific American long before Google was the dominant search engine. So this patent application is probably more to prevent allegations that Google infringed by adding on all the other checks and balances to the original PageRank technology to discourage spam sites.

    Moiche

  • by slappyjack ( 196918 ) <slappyjack@gmail.com> on Thursday June 16, 2005 @08:14AM (#12831607) Homepage Journal

    There's a lot to take onboard here and consider. But you can't go far wrong with your SEO if you try to grow your site as organically as possible.

    If any of you have worked in a small online shop you know what a fucking holy war this is between marketing and pretty much everyone else. I specifically remember saying at one point, "Do we have to make ALL of the money RIGHT NOW?"

    Good for Google for coming forward and telling people they won't be a part of that slimy shit.

    Bad for Google for saying all of this to drive up prices on their AdWord sales.
  • by PepeGSay ( 847429 ) on Thursday June 16, 2005 @08:29AM (#12831711)
    Remember, this is a patent, which requires no working model. In other words, this could be how Google *envisions* their search working as much as it identifies any of the things it actually does.
  • by FunWithHeadlines ( 644929 ) on Thursday June 16, 2005 @09:14AM (#12832003) Homepage
    I started Fun With Headlines [funwithheadlines.net] a few years back, and with no advertising on my part I was surprised how quickly Google picked me up. Right now I'm about the 5th or 6th result when you search for "fun headlines" and (obviously) the 1st when you search for "fun with headlines." At times I have been the 1st for "fun headlines," and at other times I have been around 10th.

    OK, so there aren't that many sites like mine, let alone sites that update daily over a period of years and include their entire archive on the site that grows daily. On the other hand, to my knowledge from doing searches on Google, I have very few sites that link to mine, and I thought that counted highly with Google. So basically without trying to game the system, let alone advertise my site (other than incidentally in comments like this), I've been treated really well by Google.

    In my case, it must be the longevity issue coupled with the scarcity of sites like mine. It sure ain't the links to my site.

  • So... (Score:3, Funny)

    by sootman ( 158191 ) on Thursday June 16, 2005 @09:15AM (#12832014) Homepage Journal
    ...that whole pigeon thing was a joke? I can't believe it. Maybe this filing is just a way to divert our attention?
  • If your site has been registered for less than a year, then it counts against you.

    Interesting. This means that registering domain names as soon as I think of them, even though I tend to not get around to actually building the site for them for a while, is to my advantage (and not just for the sake of securing the name). I have one domain that I registered four years ago, but didn't have time to put anything more than a simple placeholder site on it until now. Now that I finally have it going, Google ma

  • I could barely get through that article because it was so horribly written.

    "As well as the number, quality and anchor text factors of a link."

    WTF kind of sentence is that!?

  • I'm evil and want my small business competitor to drop in the rankings.

    I set up a link-exchange farm and make sure he's listed prominently.

    POOF he's branded a spammer.
  • by NanoGator ( 522640 ) on Thursday June 16, 2005 @10:37AM (#12832607) Homepage Journal
    Alrighty folks, you know the drill! Google filed a patent, ready pitchforks!!

  • by Animats ( 122034 ) on Thursday June 16, 2005 @11:52AM (#12833235) Homepage
    A stronger approach would be to find out who really owns the site.

    For example, let's search Google for "london hotels", a common search phrase. The first return is LondonNights.com [londonnights.com]. "Whois" returns "Worldview Ltd, 16 Marine Road West, Morecambe, LA3 1BS, Lancs, GREAT BRITAIN (UK)."

    That's a UK company, so we look it up at Companies House. [companieshouse.gov.uk], where we find "WORLDVIEW LIMITED, 16 MARINE ROAD WEST, MORECAMBE, LANCASHIRE LA3 1BS, Company No. 04588973". So we have a match on a registered company.

    We check further with Dun and Bradstreet [dnb.com], which has a worldwide database of companies. We find "WORLDVIEW LTD 16 MARINE RD WEST MORECAMBE , UK Type of Location: single"

    So they pass company validation, and we can get financial information about them.

    Now let's try a domain that just appeared in a spam: "fleagroups.com". "Whois" gives us "Flea Market Groups. 126 73rd Ave N., Coral Springs, Florida 34992. US" So we go to Sunbiz, the Florida State Division of Corporations [sunbiz.org], and search. No "Flea Market Groups" under fictitious names. No match on address under anything beginning with "Flea". No "Flea Market Groups" under corporations, and no "Flea Market *" address matches.

    Looking in Dun and Bradstreet, there are "Flea Market *" hits, but no exact match and no address match.

    So they fail company validation. Add to probable spammer list, drop search engine ranking.

    This is a reasonable test for any site that appears to be selling something.
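The matching step of the validation walked through above can be sketched as follows. It assumes the whois data and the company-registry records have already been fetched by some other means; the function names, the abbreviation table and the 0.8 overlap threshold are all hypothetical and chosen only so that the two examples from the comment behave as described.

```python
import re

# Hypothetical abbreviation table; a real one would be far longer.
ABBREVIATIONS = {"LTD": "LIMITED", "RD": "ROAD", "AVE": "AVENUE"}

def normalize(text):
    """Upper-case, drop punctuation, and expand a few abbreviations."""
    words = re.findall(r"[A-Z0-9]+", text.upper())
    return [ABBREVIATIONS.get(w, w) for w in words]

def passes_company_validation(whois_name, whois_address, registry_records,
                              min_overlap=0.8):
    """Return True if some registry record matches the whois registrant.

    registry_records: iterable of (company_name, address) pairs.
    A record matches when the normalized names are equal and most words
    of the shorter address also appear in the longer one.
    """
    name = normalize(whois_name)
    address = set(normalize(whois_address))
    for rec_name, rec_address in registry_records:
        if normalize(rec_name) != name:
            continue
        rec_words = set(normalize(rec_address))
        if not address or not rec_words:
            continue
        overlap = len(address & rec_words) / min(len(address), len(rec_words))
        if overlap >= min_overlap:
            return True
    return False
```

Run against the two cases in the comment, the Worldview record matches despite "Ltd" versus "LIMITED" and the extra country words, while a registrant with no registry records at all fails.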

  • Links from blogs.
  • by Vadim Makarov ( 529622 ) <makarov@vad1.com> on Thursday June 16, 2005 @03:55PM (#12835763) Homepage
    For example, one of the criteria that they use is the number of years that your site has been registered. If your site has been registered for less than a year, then it counts against you.

    So I get the following:

    Date: 2 Jun 2005 11:42:45 -0000
    From: Bettina Jensen <bdomains@itmarketinggroup.com>
    To: makarov@vad1.com
    Subject: [#17922] Buying your domain: vad1.com

    Dear Webmaster

    I am interested in buying your domain vad1.com for $400.
    I'm only interested in the domain not in your content, so
    you can sell your domain and move your content to another domain.
    If you are interested please respond to this e-mail.

    Regards,

    Bettina Jensen
  • cat got my tongue (Score:3, Insightful)

    by anthony_dipierro ( 543308 ) on Thursday June 16, 2005 @06:56PM (#12837279) Journal

    Google's Site Ranking algorithms reveal how hard they are making it for spam sites to get listed (on Google).

    And provides a list of techniques for spam sites to use that guarantee them positions on every search engine but Google (in fact, if you use these techniques it's illegal for other search engines to penalize you for them).

    This could be an especially evil technique for spammers.

    1. Set up search engine.
    2. Build some spam sites using search engine optimization techniques.
    3. Modify your search engine to penalize people using your optimization techniques.
    4. Get a patent.
    5. Profit, either from your increased search results in Google, or from suing Google for violating your patent.
