Google's Site Ranking Secrets
vivin writes "Ever wonder how Google's site ranking works? Wonder no more. Google recently filed United States Patent Application 20050071741 on March 31, 2005. This patent reveals a great deal of information about Google's site ranking algorithm and makes very good reading. For example, one of the criteria that they use is the number of years that your site has been registered. If your site has been registered for less than a year, then it counts against you. A site registered for a longer period of time means that the owner is probably serious about the site, and the site is probably legitimate. Google's Site Ranking algorithms reveal how hard they are making it for spam sites to get listed (on Google). This information will also make it easier for you to make sure that you get listed well in Google."
Note (Score:5, Insightful)
Re:Note (Score:2)
It was just a note so people won't go "wow, so Google is doing this or that", when they in reality may not be.
It has nothing to do with an anti-MS bias or whatever you seem to be implying.
Re:Note (Score:2)
Re:Note (Score:2)
Re:Note (Score:4, Insightful)
Uhhh...to prevent others from benefiting from it? That's what patents are for. They say it's there to promote innovation. It protects the owner's exclusive control in the hope that he might reveal his idea to the world. More often than not, what really happens is that the owner will put the invention on the shelf because a) it competes with other inventions the owner may have on the market, or b) like a land or commodities speculator, he's holding out for an exorbitant price. Note also that, more often than not, the owner of the IP privileges is not the creator. Patents are bought and sold like poker chips while the actual device rots. Only the paper pushers benefit.
what are they trying to protect then?
Their advantage over everybody else.
Patents for Potential (Score:3, Insightful)
Kind of reminds me of a science fiction story I read as a kid... this engineer is walking down the road when he sees a
or (Score:3, Insightful)
Re:or (Score:2)
Real Explanation (Score:5, Funny)
http://www.google.com/technology/pigeonrank.html [google.com]
Re:Real Explanation (Score:3, Funny)
I'm currently in negotiations with Google to see if I can use their massive amounts of Pigeon Clusters and a few statues I have handy to do some studies on the dynamics of white dielectric material.
I'll probably just end up bullshitting the answers instead.
Speed of gaining links? (Score:4, Insightful)
Re:Speed of gaining links? (Score:5, Insightful)
However, if you have gotten 1000 links at once, and over the following months no one else is linking to you - then you have probably bought the initial links, and nobody real considers the content worthy of attention.
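A rough sketch of that heuristic in Python, for illustration only. Nothing here comes from the patent or from Google: the function name, the monthly bucketing, and both thresholds are made-up assumptions.

```python
def looks_bought(monthly_new_links, burst_share=0.9, quiet_months=2):
    """Flag a link history where nearly all links arrive in one month.

    monthly_new_links is a list of new inbound links per month.
    burst_share and quiet_months are hypothetical thresholds.
    """
    total = sum(monthly_new_links)
    if total == 0:
        return False
    peak = max(monthly_new_links)
    peak_month = monthly_new_links.index(peak)
    after = monthly_new_links[peak_month + 1:]
    # Need enough history after the burst before judging it.
    if len(after) < quiet_months:
        return False
    # Suspicious if the single peak month holds almost every link.
    return peak / total >= burst_share


# 1000 links at once, then near silence.
print(looks_bought([1000, 3, 0, 1]))
# Steady growth with no isolated burst.
print(looks_bought([40, 55, 70, 90]))
```

A site that gets slashdotted and then keeps picking up links would fall under the share threshold, which is the distinction the comment is drawing.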
Re:Speed of gaining links? (Score:2)
Re:Speed of gaining links? (Score:3, Interesting)
So it's only gained one link through
Re:Speed of gaining links? (Score:3, Insightful)
We've had a couple of slashdotted articles, and the logs have shown visitors coming in from thousands of new links. Although for whatever reason, Google picked up on almost none of them...
Spammers killing Google (Score:5, Interesting)
Sometimes when I search for something specific, I get a bunch of useless links that have results of other "search engines" that invariably show something similar to "0 results for your search terms 'sheep+barn+slashbot+erotica'"
How do these sites get on the first page of Google results?
Re:Spammers killing Google (Score:2, Insightful)
Re:Spammers killing Google (Score:5, Interesting)
Re:Spammers killing Google (Score:2)
Or instead, just change the browser tags of GoogleBot to random but human-identifiable names to prevent sites from displaying only to GoogleBot.
Instead of GoogleBot 2.0, it is "Google - Not Firefox - Bot 2.0" or "Internet Explorer via GBot v2". It would at least deter the sites that try to cheat Google, and it would still identify GoogleBot's traffic to webmasters.
But with all the spam sites, I don't see why
Re:Spammers killing Google (Score:2)
It might be against the Robots Exclusion Standards [robotstxt.org] to deliberately fake your UserAgent header, but that's mostly so you can contact the robot's owner if it goes wrong and accidentally DOSes your site.
I doubt severely anyone would mind if Google did an occasional, low-impact, slow, back-up crawl disguised as IE (presumably also from an IP address block not known to belong to Google), especially since GoogleBot has only
Re:Spammers killing Google (Score:2)
The page is still not available on Google, and I don't even care about ranking high. Every other search engine has found the site except Google.
Ways to get linked from a high-PR site (Score:2)
I have a website with a robot.txt begging for the Googlespiderbot to come by, and it still hasn't.
Have you got the site listed in one of the major directories (Yahoo, Dmoz, etc)? Have you tried setting it as your Homepage in your Slashdot preferences [slashdot.org]?
Doesn't work, see explanation (Score:5, Interesting)
A while back I proposed a distributed approach like this in the Nutch mailing list [mail-archive.com]. The problem is that it would be hard to implement and it may not be worth the effort, since there are cheaper ways to fight spam.
Re:Spammers killing Google (Score:5, Informative)
They get to the top through link spamming, 302 hijacks, "scraping" content from other sites, search engine optimisation etc etc etc.
They are sites "made for adsense", as it's called, whereby they exist for the sole purpose of being highly ranked in Google and getting ad clicks from people looking for something else. Effectively 'doorway' pages, which make a shitload of money, as people that land on such pages don't find what they really want, so they click through on the ads in hopes of finding it there instead.
The crap of the internet: many hundreds of thousands of such sites, run by only a few thousand very rich people.
Re:Spammers killing Google (Score:5, Insightful)
Re:Spammers killing Google (Score:2, Interesting)
Re:Spammers killing Google (Score:5, Funny)
Hey, there's a help button... *clicks*... Oh God...
Re:Spammers killing Google (Score:3, Insightful)
It's bad web design, plain and simple.
Re:Spammers killing Google (Score:2)
Mailing list archives are almost always exactly what I was looking for. They are pure information, no marketing fluff like the official pages for a product often are. The problem is that some mailing lists have a gazillion mirrors, so if the first hit isn't exactly what you were looking for, you have to flick through 10 pages of exactly the same result before you get to something else that might solve your problem. A voluntary standard for mirrors of mail
Re:Spammers killing Google (Score:2)
1. Do-it-yourself search site (doesn't matter if it uses google, just put it there)
2. Store in a database the most used search terms
3. Produce a list of links with query strings that contain the search terms
4. Wait for google to index the page and crawl it
5. ???
6. Profit!!
Re:Spammers killing Google (Score:2)
Its a patent... and a laundry list... (Score:5, Insightful)
Some of these techniques are just plain old bizarre and might be way too difficult to approach algorithmically.
Oh well
Re:Its a patent... and a laundry list... (Score:2)
In order for a patent claim to be valid, a prototype must exist. Perhaps the most important word here is claim. There are two main parts of a patent: a description which is provided for purposes of explanation, and the claims which are the really important part. It's not uncommon for several patents to use the same description; last I knew, my previous employer was pursuing four patents on
Re:IMNAPL (Score:4, Informative)
Re:Its a patent... and a laundry list... (Score:2)
You're right. Google prides itself on "do no evil".
BUT. Software patents are IMHO evil, Google just got one, therefore Google did evil. I still like Google, but if Microsoft patents are bad, so are Google ones.
And one may argue that some of Google's Gmail terms weren't the most "nice", but I didn't really have a problem with them. I do with this.
Start Google ranking improvement business (Score:2, Funny)
Step 2: Go 5 years into past, buy domain names, set up sites with lots of soft porn images
Step 3: Return to present, stopping off each year on the way to renew domains. Step 4: Sell to spammers etc.
Step 5: Profit.
I'm open to venture capitalists for investment in this one.
Re:Start Google ranking improvement business (Score:2)
Step 2: Go back to 1994 and register Google.com (and Google.net and Google.org) before Google does.
Step 3: Offer to sell the domains to them for 0.25% ownership of the company, and 0.5% of the stock to be issued in any "hypothetical" "future" IPO [yahoo.com]; this should be small enough they'll cough up without hesitating.
Step 4: Pop back to 1977 and pick up 100 shares of Berkshire Hathaway while you're about it.
Step 5: Profit!
Think Big. Win Small. --Darius Regulo, the King of Heaven
Re:Start Google ranking improvement business (Score:2)
Besides, it seems like a lot of work when you already have a time machine. An easier solution would be:
1. Find an event you can bet on with a huge payout.
2. Go back in time and bet as much as you can.
3. Go to step 2 until you're sick of money.
SEOs make me barf (Score:5, Insightful)
Re:SEOs make me barf (Score:5, Interesting)
Are you serious? (Score:3, Informative)
Submitting the site to Google is a negative in their algorithm. Back when I had therabbithole.redback.inficad.com for my domain name Google found my site within a month.
You can't be successful in a
Re:SEOs make me barf (Score:2)
Who says your site is the "best tech site ever"? What if I decide that *my* site is the "best tech site ever" and game the system to bump out yours?
The site that has truly good content and that has been around longer should be ranked highest. Trying to manipulate your page rank using any other means is a lit
Re:SEOs make me barf (Score:2)
Which is precisely what pagerank, in essence, was designed to prevent. YOU can't make it popular, other people have to choose to do it by deciding your site is interesting and by linking it.
As such, Google only reflects your popularity. If everyone could manipulate their results to "get it popular" on Google, then no one's site would be.
As to getting yours popular, write articles, participate in blogs, get /.'ed.
(Of course, if you have no money for advertising then you
Re:SEOs make me barf (Score:2)
They're worse than lawyers. With a lawyer you at least *know* you're going to get ripped off.
They're more like chiropractors. The only reason you need one is that you don't want to do the work (building a good site/product or living a healthy lifestyle).
Not that simple (Score:5, Insightful)
No, it's not that simple. Let's say I have a small business; I sell garden tools, lawnmowers, etc. in a certain region. And yet when I do a search on Google for garden tools + region, I am nowhere to be found. What do I do? I optimise the hell out of my site, caking it with region name + garden tools information, and I set up a links exchange program, getting in links left, right and centre from related sites. This is SEO, and it will only affect people that enter a search for "garden" "tools" "my region". In other words, those that actually want to find my site.
There's a distinction between SEO and spamming; if I were to optimise for a garden tools site and set up a poker site there, that would be spamming.
Re:Not that simple (Score:4, Insightful)
The way I see it, SEO is a tool - nothing more, nothing less. It isn't inherently evil or inherently good - it's how you use it and what you use it for that matters.
If you've got a good site on... i dunno... aardvark polishing for fun and profit, then you should rank highly on Google. If you don't rank well on Google, it's probably because your site is lacking one of fame, content or clean code. All of these are necessary for (or inevitable side-products of) a good site that does what people want.
Conversely, a good site will probably have many inbound links, clean semantic markup, well-focused pages full of good content and so on. This is simply good site design (or, like the links, a side-effect of it), but it's also the very ethical end of the SEO spectrum.
Now, you also get evil scumbag fuckwits-for-hire who specialise in link-farming, keyword stuffing, cloaking and other black-hat techniques, and sell their services to shitty pr0n or spam sites. This is spam - no doubt about it - but it only represents the black-hat side of SEO.
The black-hat SEOers, it must be admitted, are the ones who get all the attention. They're the ones advertising like mad, making overblown claims, spamming search engines with crap listings and generally getting in people's faces. However, just because these people use SEO doesn't make SEO bad. Before SEO they were likely sending e-mail spam until that got too hard, but you don't unilaterally brand professional-looking e-mails or people who sell mailing-list managers as evil, do you?
As Google et al. get their acts in gear and revamp their algorithms, "SEO" is increasingly overlapping with "good site design" - this was always the intention, and even now "white-hat SEO" and "good site design" are pretty much synonymous.
SEO isn't the problem - the problem is a combination of shithead black-hat SEOers, Search Engines inadequately assessing a page's worth and ill-educated types who shortsightedly blame the gun or bullet instead of the guy who fired it at them.
Re:SEOs make me barf (Score:2)
No, it's written from the perspective of helping people avoid inadvertently falling into the trap of being labeled a spammer. E.g. "If you are on a shared server it's possible somebody else on that server is using dirty tactics or Spaming. If so your site will suffer since you share the same IP."
His conclusion on page 3: "Overall keep it ethical and you can't go wrong."
Re:SEOs make me barf (Score:2, Insightful)
That is the opinion of an SEO douche bag. In the real world, users of Google want Google to figure out which one has the edge, not some half assed marketing dink with knowledge of HTML.
I can look at a website and in 5 seconds determine if it has employed SEO techniques
So can we. When result #1 is less useful than result #5, we know an SEO asshat has been employed.
That i
Re:SEOs make me barf (Score:2, Insightful)
Google doesn't have the right to dictate which business is better. Google has a right to determine through their algorithm which site THEY think is better. As a searcher, by selecting Google as MY search tool, I am saying "I think Google's suggestions are the best". No one forces anyone to use Google, people choose to use it, because it work
Dude. You're just not getting it. (Score:3, Insightful)
Temporal data. (Score:2)
Woah, I'm a genious! [slashdot.org] ;-)
"Am I correct in assuming that these sites pop up and down relatively often? Maybe it'd be possible to use a temporal component in the rating. Say if the link points to a site which was just registered two days ago, it's given a very, very low weight, and then you ramp up as time goes by."
Re:Temporal data. (Score:5, Funny)
Re:Temporal data. (Score:2)
I thought so .. changed my site from .ro to .com (Score:5, Interesting)
Instantly, our ranking went from number one (for "Dreamweaver Php", for example, we were number one for a long time, ahead of Macromedia itself) to page 10.
Now we're working hard to promote our site and we have links all over the place, but our site still hasn't made it back to page 1 (search for "dreamweaver extensions" - we have to pay to get our site in the first position). I even thought that they do this on purpose so that we continue to pay for Google Ads.
They probably say it in the patent too, but the best ranking tool is to use the right "title" tag in your pages. It's remarkable how well this scores compared to the page content.
Alexandru
Re:I thought so .. changed my site from .ro to .co (Score:5, Informative)
Alexandru
Re:I thought so .. changed my site from .ro to .co (Score:3, Informative)
Foiled Again Google! (Score:4, Funny)
Re:Foiled Again Google! (Score:2)
PageRank (Score:5, Informative)
About the author (Score:5, Insightful)
So that explains a lot. What a crappy article. I wonder if the submitter is the same as the author?
Re:About the author (Score:2)
Re:About the author (Score:2)
However, I'm sure his business has gotten a big leap in credibility through the link in Slashdot. (Links in comments don't get anyone anywhere, but I believe the links from the main page do).
D
Re:About the author (Score:3, Interesting)
Registration Age vs. Registration Duration (Score:5, Informative)
one of the criteria that they use is the number of years that your site has been registered
is not the same thing as (from the article):
How many years did you register your domain name for?
Though the summary suggests that older sites do better, the article is stating that, in order to improve one's Google ranking, domain owners should purchase longer domain registrations.
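To make the difference concrete, here is a small Python sketch that computes both numbers from the dates a Whois record would give you. The dates and function names are invented for illustration; which of the two (if either) Google actually weighs is exactly what's in dispute here.

```python
from datetime import date


def registration_age_years(created, today):
    """How long the domain has already been registered."""
    return (today - created).days / 365.25


def registration_remaining_years(expires, today):
    """How far into the future the registration is paid up."""
    return (expires - today).days / 365.25


# Hypothetical domain: registered in 2001, currently paid through 2006.
created = date(2001, 4, 1)
expires = date(2006, 4, 1)
today = date(2005, 4, 1)

print(round(registration_age_years(created, today), 1))
print(round(registration_remaining_years(expires, today), 1))
```

An old domain renewed one year at a time scores high on the first number and low on the second; a brand-new domain bought for ten years up front is the reverse.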
Impossible? Spyware? (Score:4, Interesting)
Google does have a click-through engine attached to the results, but many people already find this, in addition to the single identifier cookie that Google pushes onto you, abusive.
We all think Google is doing a good job, and it did manage to incorporate ads and an ad service that is well accepted by the people. (I wonder why people still think it is a good idea to make blinking and noisy Flash ads?) The point is: how much do we trust Google? I personally don't mind the click-through very much, but I do not accept the cookie and will not install a toolbar.
Re:Impossible? Spyware? (Score:2)
Re:Impossible? Spyware? (Score:2)
search ->
see the first page for less than 1 sec ->
see the second page for less than 1 sec ->
see the third page for a long time (that you spent reading the first page)
All of the timing heuristics are noisy. You can click a link, have your phone ring, and then discover that you just found o
Snizzle my Buzzle? (Score:2)
[DBNETLIB][ConnectionOpen
(PreLoginHandshake()).]General network error. Check your network documentation.
E:\WEB\BUZZLE\EDITORIALS\../common.asp, line 156
Buzzle? Okay if this guy is a fan of Dr Dre or something I'm going to eat my own socks...
Page fragged (Score:2)
Seems ironic.
more on the subject (Score:5, Informative)
Some more info on the subject:
1. U.S. Patent Application [uspto.gov] - it's best to read what's exactly been patented.
2. interesting discussion on webmasterworld [webmasterworld.com]
Personally I think that while some of the stuff is interesting, most of it is made up, more to confuse SEOs than anything else (Google doesn't quite like them, you know that, right?). Before that, they had a couple of factors to think about and work on. Now there's a shitload of stuff that just makes their work harder. Also, more factors influencing SERPs means it's much, much harder to do trial-and-error research on what works well and what doesn't.
Spammers (Score:3, Insightful)
Won't this information now make it easier for spam sites to get listed?
Re:Spammers (Score:2)
Why did Google do this? (Score:2, Interesting)
Their pagerank algorithm was one of the keys to their success. Keeping it secret was one of the things that made Google work and it was a good secret - nobody completely knew how it worked. So why patent it? What's the point?
Re:Why did Google do this? (Score:2)
Here's a good ol' Wikipedia article [wikipedia.org] which has had the formulas for it on there for some time now.
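For anyone who'd rather see it run than read formulas, below is a minimal Python sketch of the textbook power-iteration form of PageRank. It follows the publicly described formula only, with the commonly quoted damping factor of 0.85; the toy link graph is made up, and none of this claims to match what Google runs in production or anything in the patent application.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Toy web: "c" is linked from three pages, "d" from none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

The point the toy graph makes is the one from the thread: a page's score comes from who links to it, so "d", which nobody links to, ends up at the bottom no matter what it does on its own pages.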
Just look at the patent application (Score:4, Informative)
Just look at the patent application yourself [uspto.gov].
I haven't read the whole thing, but just having taken a quick look at it, I have to agree with the posters who said that Google purposefully tried to cover any conceivable technique to index and rank pages. The application discusses multiple implementations of the various techniques that could be used to rank a page. Therefore analysis of the patent application is probably of limited utility for those trying to game PageRank (which was certainly a factor that Google's very competent IP lawyers considered before prosecuting the patent).
For those who are worried that Google is doing evil with this patent application, given the breadth of the patent and the fact that it discusses a plethora of techniques which Google may or may not be using, I will be surprised to see Google try to use this patent (or be able to use this patent) to push another search engine out of the market. More likely, I think, is that this will constitute prior art to enable Google to withstand challenges from other patent applicants for infringement. Of course, if you know anything about PageRank, you know that it was getting published in Scientific American long before Google was the dominant search engine. So this patent application is probably more to prevent allegations that Google infringed by adding on all the other checks and balances to the original PageRank technology to discourage spam sites.
Moiche
I find this quote funny: (Score:3, Insightful)
If any of you have worked in a small online shop, you know what a fucking holy war this is between marketing and pretty much everyone else. I specifically remember saying at one point, "Do we have to make ALL of the money RIGHT NOW?"
Good for Google for coming forward and telling people they won't be a part of that slimy shit.
Bad for Google for saying all of this to drive up prices on their AdWord sales.
Doesn't mean it really works this way (Score:4, Interesting)
My experience with FunWithHeadlines.net (Score:3, Interesting)
OK, so there aren't that many sites like mine, let alone sites that update daily over a period of years and include their entire archive on a site that grows daily. On the other hand, to my knowledge from doing searches on Google, I have very few sites that link to mine, and I thought that counted highly with Google. So basically, without trying to game the system, let alone advertise my site (other than incidentally in comments like this), I've been treated really well by Google.
In my case, it must be the longevity issue coupled with the scarcity of sites like mine. It sure ain't the links to my site.
So... (Score:3, Funny)
not exactly squatting, but... (Score:2)
Interesting. This means that registering domain names as soon as I think of them, even though I tend to not get around to actually building the site for them for a while, is to my advantage (and not just for the sake of securing the name). I have one domain that I registered four years ago, but didn't have time to put anything more than a simple placeholder site on it until now. Now that I finally have it going, Google ma
Horrible Writing (Score:2)
I could barely get through that article because it was so horribly written.
"As well as the number, quality and anchor text factors of a link."
WTF kind of sentence is that!?
So now I can make my foe look like a spammer? (Score:2, Interesting)
I set up a link-exchange farm and make sure he's listed prominently.
POOF he's branded a spammer.
Uh oh, Google applied for a patent! (Score:3, Funny)
Other, better approaches to search engine spam (Score:3, Insightful)
For example, let's search Google for "london hotels", a common search phrase. The first return is LondonNights.com [londonnights.com]. "Whois" returns "Worldview Ltd, 16 Marine Road West, Morecambe, LA3 1BS, Lancs, GREAT BRITAIN (UK)."
That's a UK company, so we look it up at Companies House [companieshouse.gov.uk], where we find "WORLDVIEW LIMITED, 16 MARINE ROAD WEST, MORECAMBE, LANCASHIRE LA3 1BS, Company No. 04588973". So we have a match on a registered company.
We check further with Dun and Bradstreet [dnb.com], which has a worldwide database of companies. We find "WORLDVIEW LTD 16 MARINE RD WEST MORECAMBE , UK Type of Location: single"
So they pass company validation, and we can get financial information about them.
Now let's try a domain that just appeared in a spam: "fleagroups.com". "Whois" gives us "Flea Market Groups. 126 73rd Ave N., Coral Springs, Florida 34992. US". So we go to Sunbiz, the Florida State Division of Corporations [sunbiz.org], and search. No "Flea Market Groups" under fictitious names. No match on address under anything beginning with "Flea". No "Flea Market Groups" under corporations, and no "Flea Market *" address matches.
Looking in Dun and Bradstreet, there are "Flea Market *" hits, but no exact match and no address match.
So they fail company validation. Add to probable spammer list, drop search engine ranking.
This is a reasonable test for any site that appears to be selling something.
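The matching step could be automated along these lines. This is only a sketch in Python under loud assumptions: the `registry` list stands in for a real lookup against Companies House, Sunbiz or Dun and Bradstreet (no real API is called), the abbreviation table is a tiny hypothetical sample, and only the street line of the address is compared.

```python
import re

# Hypothetical sample of abbreviations to collapse before comparing.
ABBREVIATIONS = {"LIMITED": "LTD", "ROAD": "RD", "AVENUE": "AVE"}


def normalise(text):
    """Uppercase, strip punctuation, and collapse common abbreviations."""
    words = re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def street_line(address):
    """Use only the part before the first comma as the street line."""
    return normalise(address.split(",")[0])


def passes_company_validation(whois_name, whois_address, registry):
    """True if a registry record matches the Whois name and street line.

    registry is a list of (name, address) pairs standing in for an
    official company register lookup.
    """
    name = normalise(whois_name)
    street = street_line(whois_address)
    for reg_name, reg_address in registry:
        if normalise(reg_name) == name and street_line(reg_address) == street:
            return True
    return False


# Stand-in registry holding the one record quoted above.
registry = [
    ("WORLDVIEW LIMITED", "16 MARINE ROAD WEST, MORECAMBE, LANCASHIRE LA3 1BS"),
]

print(passes_company_validation(
    "Worldview Ltd",
    "16 Marine Road West, Morecambe, LA3 1BS, Lancs, GREAT BRITAIN (UK)",
    registry))
print(passes_company_validation(
    "Flea Market Groups",
    "126 73rd Ave N., Coral Springs, Florida 34992. US",
    registry))
```

Exact matching after normalisation is deliberately strict; a real version would need fuzzier address matching and a way to handle legitimate sites run by individuals rather than registered companies.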
Hole in the algorithm (Score:2)
Number of years the site has been registered (Score:3, Interesting)
So I get the following:
cat got my tongue (Score:3, Insightful)
Google's Site Ranking algorithms reveal how hard they are making it for spam sites to get listed (on Google).
And it provides a list of techniques for spam sites to use that guarantee them positions on every search engine but Google (in fact, if you use these techniques, it's illegal for other search engines to penalize you for them).
This could be an especially evil technique for spammers.
Re:ATTENTION (Score:4, Funny)
I want to change. Please help me--I don't think I can do it on my own.
Re:Speeling? (Score:2, Insightful)
Re:New Startups (Score:2)
When google.stanford.edu was a startup, the web was still in nappies, and spam wasn't such a problem.
Re:New Startups (Score:2, Insightful)
Re:Yay For Patents (Score:3, Insightful)
At the moment, the system is horribly abused, but the basic principle is a good one. I would be completely in favour of software patents if:
Re:What do editors do? (Score:2)
Please slap some downmods on this one too!
Re:What do editors do? (Score:2)
Re:What do editors do? (Score:2)
Re:Please stop that hover red text thing!!! (Score:2)
I once worked at a site and we had to take a lot of the nice-looking CSS out because people weren't smart enough to know what was a link and what wasn't.
I should not have to analyze your site to determine where the links are. It should be stunningly obvious. If people are complaining that they cannot find the links, you have designed your site wrong.
Then again, our customer base was so dumb one guy actually faxed us to ask for our phone number.
Re:Meta ratings (Score:2)
I like the idea, but I think there are a couple of reasons why Google hasn't done it:
Re:Google vs Altavista (Score:2)
Re:Already ./'ed ? (Score:2)
Re:first it was people using (Score:2)
That's not really much of a neologism, though. 'Hey, this CD's really good - you should have a listen some time.' I've heard 'listen' being used as a noun in that sense for years.