News for nerds, stuff that matters

Alexa, Amazon's Most Flawed Idea

Posted by CmdrTaco on Thu Oct 19, 2006 11:19 AM
from the ain't-that-the-truth dept.
Rub3X writes "The Alexa ranking system is naturally flawed. The data should never be treated as accurate, as it's easily manipulated, and not supported for most browsers in the world. It's an estimate, and nothing more." I've been saying that forever, but unfortunately for me, since it's a number on a website that is considered "Real" to some, I'm supposed to take it seriously. I imagine this is a problem for many webmasters out there.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by in2mind (988476) on Thursday October 19 2006, @11:27AM (#16503135) Homepage
    Services like Megaupload.com force non-American/non-European users to install the Alexa toolbar to download a file.

    That explains why Alexa has file-upload sites such as Megaupload and RapidShare in the top 10 sites of most countries...
  • Error in article (Score:4, Insightful)

    by tont0r (868535) on Thursday October 19 2006, @11:27AM (#16503151)
    According to the article:
    "Alexa has no support for FireFox, Opera or Safari at all. "

    According to Alexa's Wiki:
    "Users running any browser except Internet Explorer and Mozilla Firefox are not represented. Thus users of Opera, Safari, mobile phone (WAP) browsers are all ignored. Nevertheless, this is still the vast majority of the browser market."

    So it's half right :P
    • Re: (Score:3, Informative)

      According to this [alexa.com], there is no FF version. There are third party plugins for monitoring only.
    • Re: (Score:3, Interesting)

      I don't see anywhere where you can actually download a toolbar for Firefox - not that I would if they had one.

      That's the other problem with Alexa. It doesn't include clueful users. Of course, they aren't statistically significant.
  • by technoextreme (885694) on Thursday October 19 2006, @11:27AM (#16503155)
    I've pointed this out before. There are weird statistical anomalies that should show that Alexa's webratings are not perfect. Take a look at this data for Slashdot and Digg. The traffic ratings both shoot up within a short amount of time. It just doesn't make much sense. http://www.alexa.com/data/details/traffic_details? &range=2y&size=medium&compare_sites=www.digg.com&y =r&url=www.slashdot.org#top [alexa.com]
    • Re: (Score:2, Insightful)

      by Anonymous Coward
      I wonder, would the obvious spike in users at Digg & /. be due to the introduction of an Alexa plugin for Mozilla Firefox at that time (May 2006)?

      www.stevecastle.org [stevecastle.org]

      Just askin'...
      • by jamie (78724) * <jamie@slashdot.org> on Thursday October 19 2006, @12:07PM (#16503785) Homepage Journal

        If so, it kind of makes the case that Alexa data is less than useful.

        But that's not all that's going on. In Nov-Dec 2005 it shows Slashdot's traffic roughly tripling, then settling down to roughly double its previous level, in the space of about a month. I have our traffic logs from that time. They were basically flat. All of the variance was Alexa anomalies.

        • Re: (Score:3, Insightful)

          All of the variance was Alexa anomalies.
          In the past three years, the one Slashdot article with Alexa in the title was in December 2005. No doubt a few slashdotters took a quick look at the toolbar, and then quickly decided it was worthless.
        • If so, it kind of makes the case that Alexa data is less than useful.

          It's not "less than useful".

          In fact, this is both a completely obvious and a completely stupid article submission. The "duh" tag is appropriate, both because none of the current ranking/statistics systems are accurate, and because despite that, they are still useful.

          When you're looking at numbers like total reach, or you're comparing one web site with another, nobody needs statistics that are 100% accurate. I don't need to know if CNN has 4 million unique visitors per day or 4,409,765 unique visitors per day. You're using these services to get a general idea. If I'm running a web site, for example, I know what my own stats are - I don't need Alexa to tell me. But I can still use Alexa to tell me the basic gist of a competitor, and if they're not as accurate as internal stats would be, what does that matter?

          Moreover, Alexa's stats are no more or less accurate (or easy to manipulate) than those of major organizations like Nielsen. The fact of the matter is any system that's not using actual server logs is going to have some inaccuracies (and if you think otherwise, then you've just bought into marketing spin). You live with it and accept it. The main difference is that Alexa is free, whereas other stat compilers charge thousands of dollars per year.
        • Re: (Score:3, Interesting)

          by Anonymous Coward
          I remember about 5 years ago we were making a new website and it wasn't launched yet, but it was publicly visible. About 5 of us had the Alexa toolbar installed and we were spending a bit of time on the site checking it and making sure links worked etc. A couple of weeks into it, before we had even launched the site, we had an Alexa ranking of around 60,000.

          Honestly Alexa must know it's a big joke. It's still annoying to see it given any relevance.
    • Actually it makes perfect sense if that is when Alexa added support for Firefox, which is heavily used on both Slashdot and Digg...

      As for Digg taking over Slashdot... well, maybe there are less punks on Digg. That would make me switch. Other than the punk factor, it must just be viral marketing.

      • Less punks on Digg? You must be new THERE. I can't even stand to go there anymore. Digg's moderation system is a joke, as is the comment system (1 nest) and the articles are subpar, even by /. standards.

        At least on /. when the eds screw up and post something bogus, they TELL you they messed up. Digg editors edit comments, falsely up "diggs" to move something to the front page, etc. It is a fraud of a blog.
        • I must admit I don't have an account at Digg and have never read through the comments. I read the headlines and go straight to the stories.

          I have work to do, and 2 slashdot-like distractions would put me out of business!

    • Re: (Score:2, Insightful)

      I think they changed to a different statistical model at that point, there are a ton of sites that make a jump on that same date. It is a good thing that they continually refine what they have (because it is FAR FAR FAR FAR from perfect) but they should have a little asterisk there letting people know what happened.
  • Duh (Score:4, Interesting)

    by UbuntuDupe (970646) on Thursday October 19 2006, @11:28AM (#16503173) Journal
    I remember for a while LewRockwell.com, which promoted Alexa to its readers, was top-500, beating out worldnetdaily.com and gamefaqs.com. Now, nothing against LewRockwell.com, and it is indeed surprisingly popular, but there's no way in hell it's a top 500 site.
  • by truthsearch (249536) on Thursday October 19 2006, @11:29AM (#16503193) Homepage Journal
    Everyone who owns or develops web sites knows this. Anyone who hints in a forum that the numbers may be accurate immediately gets slapped down. It's the non-technical advertisers who don't know this. And they're the only ones who care about this ranking in order to gauge how much to spend on purchasing web site advertising. Since almost no web sites publicly display traffic info, advertisers find Alexa rankings very convenient and probably just don't understand why they'd be useless.

    Until advertisers "get it" or a much more accurate public metric is made available, Alexa rankings will unfortunately matter to web sites that are supported by advertising.
  • Alexa also doesn't calculate masked domains.

    I have a blogger blog that is masked with my own domain name.
    • From the tech end there's no such thing as a "masked" domain. What usually goes on behind the scenes is that the domain provider hosts a small HTML file at your actual domain. That file loads your real site in a child IFrame [wikipedia.org], which is set to the full size of your browser window, while the parent frame is set to a non-visible size. So you don't see the parent frame (the real, technical contents of whatever your domain name is) but you see its URL in the address bar, since it is the parent frame, and remains s
        • Unless the toolbar goes off of what you type into the address bar rather than the URLs that are actually loaded, which I would consider more likely. Do they say how they do it?

          I don't know, but off the top of my head I doubt page-ranking services would count other sites loaded in an IFrame. Otherwise I could create one of those useless domain-squatting pages that just exist to throw ads at people who click or type wrongly, load a bunch of actually useful, respected sites in IFrames, and use all that content
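
For readers who have not seen domain masking, the arrangement described in this sub-thread can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `masking_page`, the example URL, and the exact CSS are hypothetical, and real domain providers may use a frameset or other markup instead.

```python
from html import escape


def masking_page(real_url: str, title: str = "") -> str:
    """Build a tiny parent page that shows real_url in a full-window iframe.

    The visitor's address bar keeps showing the masking domain (the parent
    page), while the visible content comes from real_url (the child frame).
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{escape(title)}</title>\n"
        "<style>html, body { margin: 0; height: 100%; overflow: hidden; }\n"
        "iframe { border: 0; width: 100%; height: 100%; }</style>\n"
        "</head>\n"
        f'<body><iframe src="{escape(real_url, quote=True)}"></iframe></body>\n'
        "</html>\n"
    )


# Hypothetical example: a blog hosted elsewhere, shown under a custom domain.
page = masking_page("http://example.blogspot.com/", "My blog")
print(page)
```

Whether a toolbar would credit the parent URL or the framed URL is exactly the open question raised above; the sketch only shows why two different URLs are in play.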

  • Yep - Alexa's sample size is pretty small, but skewed toward IE folks and people who install/game it so their web sites rank well ... although I think anything in the top-1000 is almost always "something" significant. Unfortunately, there aren't a lotta metrics out there (plus those often provide varying results), and this is easy to understand and free, so it's often used.

    Best source is the source - i.e. would be real interesting to know what the web stats (for actual web logs) are like for a site like
  • > not supported for most browsers in the world

    You heard it here first, folks: IE and Firefox make up only a small minority of web browsers in use.

    Star-based rating systems are useless for more than getting a quick idea of what's up. They don't really tell you anything; for instance, I've purchased items in the past that have issues that don't bother me that I would have passed on just based on a "star" approach.

    This goes for Alexa, this goes for movies, etc. I suspect that most consumers of this
  • Alexa also rewards webmasters who write badly broken IE only webpages, forcing people who normally use Firefox to switch over to IE for that webpage.
  • by logicnazi (169418) <logicnazi@nospAM.gmail.com> on Thursday October 19 2006, @11:48AM (#16503487) Homepage
    Now it's clear that the rankings from this system are heavily skewed and miss a substantial portion of the user base.

    This suggests it is useless as a way to estimate how much to pay for advertising on a web site (though since this is usually per click/per display I don't see why ranking matters here). However, it doesn't show that this data can't be useful for other things. For instance it could be quite useful to know what other sites the users (or IE users) of a site visit.

    In other words the data seems useless for any statistical analysis but it could be quite helpful to know what sorts of users visit a site. Sure, Slashdot's traffic might be underrepresented, but I bet you the data still show that Slashdot users are quite likely to go browse gadget purchase sites or programming related sites. If you want to know where to advertise your new fancy gadget or a fancy new programming environment, that would be very useful information even if it wouldn't support a rigorous statistical analysis.
    • One of the things that most Internet marketers miss (myself included at times) is that as you move up the food chain, there are more buyers. If I am pushing a brand-less product, I focus on my CPC rate, conversion rate, etc., and don't care where the ad runs, only the conversion rate there and what I am paying per click.

      However, the big boys (Ford Motor Company, Warner Bros. Television, etc.) focus on brand building and budgets. They don't ask if they are making money off the impressions, they have a quar
  • BZZT. (Score:5, Insightful)

    by Rob T Firefly (844560) on Thursday October 19 2006, @11:51AM (#16503543) Homepage Journal
    One fact TFA and the Slashdot title both got wrong is that Alexa wasn't Amazon's idea. Until Amazon bought it in 1999, Alexa was the commercial offshoot of archive.org [archive.org] for three years. Alexa is still what gives the Wayback Machine its web crawls.
  • For Firefox, you can use the SearchStatus extension (download from either the Firefox add-ons page [mozilla.org] or their home page [quirk.biz]). It's actually a somewhat useful tool also, in that it displays Google PageRank and Alexa rank for each site you visit, and has a few decent tools for showing various search engine related information for a given page. It also feeds data on every page you visit to Alexa as a byproduct of looking up their Alexa rank, which may be a positive or a negative for you. I personally verified tha
  • Real? (Score:3, Funny)

    by Poromenos1 (830658) on Thursday October 19 2006, @12:03PM (#16503721) Homepage
    it's a number on a website that is considered "Real" to some

    That's not real, that's int.
  • by jasen666 (88727) on Thursday October 19 2006, @12:17PM (#16503915)
    Alexa is flawed from the start.
    What impetus or benefit would a user have to install a toolbar that tracks them? Other than out of charity to help out this company? I don't get it. Nor do I particularly trust them. Just one more thing to help crash IE.
  • WTF is Alexa? (Score:3, Insightful)

    by Yvan256 (722131) on Thursday October 19 2006, @12:36PM (#16504203) Homepage Journal
    Seriously, why bother writing two or three sentences anymore? Just put a single link on a single word, that's even less helpful and even less work for the editors.

    WTF IS ALEXA?

    Another case of "I don't want to waste 30 seconds to explain WTF the news is about, let 50K users waste a few minutes and slashdot a website trying to figure out what it is".
  • by MrNougat (927651) <[moc.liamg] [ta] [hcstarkc]> on Thursday October 19 2006, @12:54PM (#16504479)
    Whenever you conduct a poll, and that's what Alexa is doing, you are always excluding data from those who do not respond to polls (for whatever reason). It's an inherent flaw in polling.
  • by Anonymous Coward on Thursday October 19 2006, @12:59PM (#16504585)
    Wikipedia constantly uses Alexa to see if linking to a website or profiling a website is "notable". Despite outrage by the people who submitted the content, usually everything that gets nominated for deletion has some editor cite Alexa as a reason to delete it.

    • MPU. Of course, the majority of notability nazis are simply ignorant of phenomena outside their part of the world and cultural comfort circles. There've been plenty of NN charges against regionally significant topics, all because the detractor has never heard of it. Googlecounts and Alexa are among their sacred ammunition.
  • I refer you to the "plog".
  • Alexa's usefulness is more statistical than exact. I don't use it to see specifically how much traffic a site gets, but to see how it changes over time. As long as their measurement standards don't deviate too much, then this can still be useful information.
  • When Alexa first came out, I was willing to use it. There were two features that it provided, and page ranking was actually the least important. Far more important to me was the goal of building an inverted index of the web -- tracking who linked to this site I was looking at, rather than seeing who this site links to.

    All that changed when Alexa was bought by Amazon. And then the truth came out -- all the information that I thought was private was in the database, and now owned by a commercial company, with
  • Any webmaster worth their weight in salt knows Alexa data is flawed. This isn't news for nerds.

    I do use Alexa to measure the relative worth of my sites vs competitors. The data never coincides with my own analytics against referral logs and that's ok. Alexa sucks like that.
  • by sien (35268) on Thursday October 19 2006, @06:05PM (#16509891) Homepage
    The real reason for TV stats is for advertisers and TV stations to work out how much they can sell advertising for.

    What is the reason for web stats? If you're paying per view or per click then the information is directly available.

    This leads to an interesting possibility. The ad providers could provide a ranking of sites based on the number of ads that they show there and the number of clicks that are generated. This is, of course, open to manipulation via click fraud and other techniques but it would probably be more accurate than Alexa's rankings.

    Then, if you wanted to improve this even more you could combine this with the number of searches that go to a page. A large net firm that provided these services could do such a ranking. Google or Yahoo could do this. Perhaps they do, for their internal consumption.
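
A rough sketch of the kind of ranking proposed here, combining ad impressions, ad clicks, and search referrals into one score. Everything in it is an assumption made for illustration: the `SiteStats` fields, the weights, and the sample figures are invented, and nothing suggests any ad network actually computes rankings this way.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    name: str
    ad_impressions: int    # ads shown on the site (hypothetical figure)
    ad_clicks: int         # clicks on those ads (hypothetical figure)
    search_referrals: int  # searches that led to the site (hypothetical figure)


def score(site: SiteStats,
          w_impressions: float = 1.0,
          w_clicks: float = 20.0,
          w_searches: float = 2.0) -> float:
    """Weighted sum of the three signals; the weights are arbitrary placeholders."""
    return (w_impressions * site.ad_impressions
            + w_clicks * site.ad_clicks
            + w_searches * site.search_referrals)


def rank(sites: list[SiteStats]) -> list[SiteStats]:
    """Return sites ordered from highest to lowest combined score."""
    return sorted(sites, key=score, reverse=True)


sites = [
    SiteStats("alpha", ad_impressions=1_000_000, ad_clicks=5_000, search_referrals=100_000),
    SiteStats("beta", ad_impressions=400_000, ad_clicks=20_000, search_referrals=300_000),
    SiteStats("gamma", ad_impressions=50_000, ad_clicks=500, search_referrals=10_000),
]
ranked = rank(sites)
for position, site in enumerate(ranked, start=1):
    print(position, site.name, score(site))
```

As the comment notes, click fraud would feed straight into `ad_clicks`, so a real system would need some filtering before a score like this meant much.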

    • by technoextreme (885694) on Thursday October 19 2006, @11:34AM (#16503263)
      The problem is that statistically it's nice to say that 30% does not make a majority, but I'm sure that spread changes from website to website. Imagine looking at the statistics for a Linux website. The majority there better not be IE.
      • Imagine looking at the statistics for a Linux website. The majority there better not be IE.

        You'd like to think so, but I'd bet that there'd be not a few wannabe types who talk the Linux talk but haven't the cojones to walk the walk. I've known many people in my time who were often vaunted for their supposed computer literacy, and who would often sing Linux's praises whilst bashing Windows, but when it came to installing it on their own systems, it was a different story. Partitioning too scary I suppose.
      • Well, a few years back a Slashdot poll [slashdot.org] showed that IE usage here was approaching the 30% level. Of course, that isn't exactly a truly reliable, scientific poll, but it still shows that you can't simply dismiss IE usage just because it's Slashdot, or even not the majority. It'd be interesting to see where these numbers are at today.
      • Imagine looking at the statistics for a Linux website. The majority there better not be IE.


        Why not? Say I have a site about changing from Windows to Linux. I would imagine that the majority of visitors would be Windows users. If not, I would have missed my target group.
    • Given Slashdot's geek crowd, it's hard to grasp that Alexa gets its numbers from slashdotters who have the Alexa toolbar installed.

      So a corollary of that would be that the higher the number in Alexa, the higher the number of 'lame' users of a website who actually installed an Alexa toolbar.
    • Their statement can be taken two ways:

      1. "browsers" refers to software. Incorrect, as you pointed out.

      2. "browsers" refers to the people using the software to browse. Valid, and accurate. Alexa isn't supported for most users (could be and sometimes are called "browsers") in the world.

    • Last time I checked, the term 'most' meant a majority.

      Firefox, Safari and Opera may have significant market penetration, but 30% a majority does not make.

      If you're talking about "browser instances" or "browser installations," then it would be incorrect.

      "...not supported for most browsers in the world."

      Assume there are 100 actively developed browsers in the world (there are probably many more, but for the sake of argument). IE is 1 browser. That would make "other browsers" 99% of all browsers in the world, a

    • by daeg (828071) on Thursday October 19 2006, @11:51AM (#16503553)
      It doesn't matter, though, since the distribution of toolbars is not uniform across all Internet users. A good example is the website I work on. We know our traffic, yet Alexa under-reports us. We also know a local competitor's traffic -- both sets of numbers are generally public information that advertisers use. They have a nice site but get about 1/2 of our traffic, yet Alexa over-reports them over us by a factor of 3-4.

      You can pull accurate statistics if and only if your data points are distributed correctly. Because Alexa has no way to randomly and accurately assign toolbars to users, their data is not reliable in any form.

      A similar example is how political polls are taken. You can get accurate numbers with 1,000 adults if, and only if, those 1,000 are random throughout the entire population. You can skew the poll numbers by polling 1,000 Democrats or Republicans only instead of 1,000 random. Your results are only accurate to your surveyed population -- in Alexa's case, their numbers are only accurate so far as "Rank ### amongst Internet Explorer 6.0 users who speak a limited number of languages who have voluntarily installed our toolbar to submit their surfing habits to us for analysis and are subjected to trade secret methods of ranking".

      The only way that you could pull accurate numbers would be through all ISPs selecting random data points to find what hostnames people were using. It would have to be filtered, though, to produce accurate numbers in terms of actual "website hits" instead of just "website requests". Keep-alive would further impede accurate results. As would proxies, DNS caches, and HOSTS files.
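
The sampling argument above can be illustrated with a small simulation. It is a sketch under invented assumptions, not a model of Alexa's actual method: the visit rates (20% for site A, 10% for site B) and the toolbar install rates (10% among site B's visitors, 1% otherwise) are made up to mimic a competitor whose audience installs the toolbar far more often.

```python
import random


def simulate(seed: int = 0, population: int = 100_000, sample_size: int = 1_000):
    """Compare true visit counts with a random sample and a self-selected panel."""
    rng = random.Random(seed)
    users = []
    for _ in range(population):
        visits_a = rng.random() < 0.20  # site A reaches 20% of all users
        visits_b = rng.random() < 0.10  # site B reaches 10% of all users
        # Self-selection: site B's visitors install the toolbar 10x as often.
        install_rate = 0.10 if visits_b else 0.01
        has_toolbar = rng.random() < install_rate
        users.append((visits_a, visits_b, has_toolbar))

    def counts(group):
        # (number of site A visitors, number of site B visitors) in the group
        return (sum(u[0] for u in group), sum(u[1] for u in group))

    truth = counts(users)
    random_sample = counts(rng.sample(users, sample_size))
    toolbar_panel = counts([u for u in users if u[2]])
    return truth, random_sample, toolbar_panel


truth, random_sample, toolbar_panel = simulate()
print("truth         (A, B):", truth)
print("random sample (A, B):", random_sample)
print("toolbar panel (A, B):", toolbar_panel)
```

Under these assumptions the random sample keeps site A ahead of site B, as in the full population, while the toolbar panel puts site B well ahead of site A. The panel can be larger than the random sample and still get the order wrong, because the problem is who is in it, not how many.
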
      • "Rank ### amongst Internet Explorer 6.0 users who speak a limited number of languages who have voluntarily installed our toolbar to submit their surfing habits to us for analysis"

        What a great line! I'm going to steal it for next week's meetings.
    • Indeed. In other words, it's a great tool for anyone wanting to know which websites are most frequented by the vast majority of internet users.
    • The page you linked to is the stats for their "Business" category, which represents a small portion of their total page views...
    • Google has its own system for determining the importance of a page - and while it's still flawed, and only really geared towards their own goals, it does a good job of showing the importance of a website.

      Google has more than that, actually. It has a pretty good idea of traffic patterns for any sites that use its free Google Analytics [google.com] visitor/hit tracking software.

      I wonder if that data gets factored into the page rank at all... probably not, at least not yet, but I imagine such information could be used

      • I am an active Wikipedia contributor, and I agree with you wholeheartedly on this. It's probably my biggest complaint about a site I otherwise like.