The Internet Editorial

The Real Problem With Alexa (372 comments)

Posted by CmdrTaco
from the get-my-irk-on dept.
Alexa drives me nuts. It uses a broken methodology to measure the internet and is, for reasons unclear to anyone, regarded as somehow definitive simply because it allows you to compare two sites with a single simple number. Its sampling methodology is flawed and the numbers it produces are meaningless. And if you want to help me prove this, please install their toolbar. Of course since most of you are Slashdot readers, most of you won't and that only helps prove my point. Read on for what I mean by all of this, and why it matters.

As the de facto 'Guy in Charge' of a reasonably large web site, I am routinely asked questions by a variety of people that lead inevitably to Alexa. It might be a question from my Boss at SourceForge about traffic. Or it might be a sales guy asked by a possible advertiser why some other random website is bigger or smaller than Slashdot. Most often it's a random reporter doing background for a story that has nothing to do with Slashdot. Why I'm considered an expert is very confusing, but why they always regard Alexa rankings as meaningful is even more so.

Here's the problem: Alexa doesn't work because of who will install it, and perhaps more importantly, who won't. Let's start with a place I'm very familiar with: Slashdot readers. Until recently Alexa didn't work on Firefox... instead only IE users participated. On the internet as a whole that's fine: like 80% of users run IE. But on Slashdot only like a quarter of you do.
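To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 80% and 25% IE shares are the figures above; the equal real traffic and the 1% toolbar install rate are made-up assumptions for illustration, not anything Alexa publishes.

```python
def sampled_visits(true_visits: int, ie_share: float, install_rate: float) -> float:
    """Visits an IE-only toolbar panel would record for a site."""
    return true_visits * ie_share * install_rate


TRUE_VISITS = 1_000_000  # assumption: both sites get identical real traffic
INSTALL_RATE = 0.01      # hypothetical: 1% of IE users run the toolbar

typical = sampled_visits(TRUE_VISITS, 0.80, INSTALL_RATE)  # ~80% IE audience
geek = sampled_visits(TRUE_VISITS, 0.25, INSTALL_RATE)     # ~25% IE audience
ratio = geek / typical

print(f"typical site: {typical:.0f} sampled visits")
print(f"geek site:    {geek:.0f} sampled visits")
print(f"geek site looks {ratio:.0%} as big as the typical one")
```

Under those assumptions, two sites with the same real audience come out looking very different in size, and that is before anyone declines to install the toolbar on principle.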

What about re-installing the plug-in after you update your browser? When Firefox 2.0 came out, almost a third of Slashdot readers upgraded within a few days. You upgrade minor Firefox releases overnight. Even IE users of Slashdot update relatively fast, from 6 to 7 or even minor revisions. New versions often break old plug-ins. When you get that alert that a plug-in is out of date do you just forget about it? I know I do. And that's not even counting clean OS installs. But when I look at the installations of random non-technical friends and family, I frequently see versions of software so dated it makes me cringe.

And that's not even talking about the fact that Alexa's toolbar is pretty much spyware. How many Slashdot readers are giddy to install spyware? You either? Big surprise. Because of who we are, and what it is, our population will self select out of consideration.

Did you know Alexa excludes SSL? How many etrade users do you think there are? Now personally I'm glad that they aren't tracking my browsing at my credit card company, but it's just another factor reducing accuracy.

Equally perplexing is the accounting of iframes. Let's look at something like DoubleClick's Alexa rating. Now it's hard to say, but I don't think I've ever visited their website. Have you? But according to Alexa, they have nearly a 1% share of the internet. I'd tend not to believe it... but they have iframes on zillions of web pages and counting those sure would account for this huge ranking. What about all those badges for the popular social networking websites? What influence are those iframes having on Alexa rankings? Alexa's FAQ says they don't count, but I'm skeptical.

In fact, Alexa KNOWS that it is a flawed metric. Have you ever tried actually looking up Alexa on Alexa? Unsurprisingly, it is unavailable. Why? Visitors to Alexa.com would be the most likely of any user population on-line to have installed their plug-in. I don't know what their 'Rank' would be, but I bet it clearly would be an apples to oranges comparison against ANY other site on-line.

Of course who do you think actually will go out of their way to install something like this? I have a good guess... if you are obsessed with acronyms like SEO or terms like PageRank you are very likely to care very much about these things. I spend a real percentage of my week dealing with people flooding my systems with garbage content designed to screw with these ratings. And you know they all have the toolbar installed so their zillions of worthless spam websites are being counted.

This problem has parallels elsewhere of course: The Nielsen ratings struggle to account for PVRs. Since you got a TiVo, when was the last time you watched "Live" TV? This is part of why Science Fiction shows struggle on TV... scifi fans are early adopters. So we stopped getting counted and our favorite genres are butchered by networks and lost to the void. PVR users tend to be wealthy (those boxes are expensive) and educated. Now I'm not saying that the dumbing down of TV is exclusively the fault of Tivo, but it sure didn't help that we weren't being counted as excellent "Smart" TV shows get canceled while we keep getting more seasons of Survivor. Who we are and how we live causes us to not be counted, and this has unintended consequences.

So what do we do? I wish I had a good answer to this. My first suggestion would be that if anyone mentions Alexa to you that you freak out and go on a 5-minute rant about how Alexa is stupid and anyone who is using it to seriously make a business decision should be fired. It doesn't actually help, but I estimate that every time I do this, I burn the same number of calories as I might on an elliptical trainer. I assure you the beer gut ain't getting smaller on its own.

Alternatively you could just install the toolbar on every machine you can find and skew the numbers ridiculously towards people that are likely unrepresented. Of course, the conspiracy theorists amongst you will just bitch that I'm trying to fudge Slashdot's own rankings in a system I'm claiming to hate. But that only helps prove my point... the conspiracy theorist is a demographic strongly represented on Slashdot that is unlikely to trust this software. We all ignore a broken status quo "Gold" standard that would fail a 100 level college science class on the grounds of flawed methodology. And this only leads to us not being counted.

This discussion has been archived. No new comments can be posted.

  • Re:Rant as news (Score:5, Informative)

    by networkBoy (774728) on Monday July 23, 2007 @11:09AM (#19957285) Homepage Journal
    Of course it's a rant, it's an editorial.
    The tags were there before TFA.
    Furthermore, you will need to re-read it, because in your race to FP you likely only read the front page blurb. /rant.
    -nB
  • by mmxsaro (187943) on Monday July 23, 2007 @11:17AM (#19957433) Homepage
    Alexa [alexa.com] is a ranking system to measure how popular a certain website is on the Internet. A user, however, must have the Alexa toolbar installed for Alexa to measure site rankings accordingly. As of right now, Slashdot is ranked 558 out of 1 million+ sites that Alexa tracks.

    Note: you don't need to install the toolbar to figure out Alexa rankings. Check out the Search Status [quirk.biz] extension for Firefox. I have mine sitting at the bottom right corner of the browser to show me PageRank and Alexa rankings.
  • Business? (Score:4, Informative)

    by 19061969 (939279) on Monday July 23, 2007 @11:19AM (#19957473)
    In my experience, a lot of PHBs are only too happy to have information. They don't really care if it's valid information or not, just so long as it's there and that it sounds good.

    It was a massive wake-up call to realise how many middle-managers and the like will quite happily swallow any old crap as long as they perceive that it's authoritative. Has anyone ever tried to tell them about how bad the information is? (real question btw - I'm interested in seeing if other readers' experiences were as bleak as mine).
  • Alexa is useful (Score:3, Informative)

    by mbone (558574) on Monday July 23, 2007 @11:20AM (#19957479)
    It clearly has biases, and (worse) these seem to change slowly with time, but for the web sites I host, there is a nice correlation between their Alexa reach and their hit count.

    It is certainly good for a crude ranking of sites - Slashdot's rank right now is 558, and that clearly means a lot more traffic than some site with a rank of 5 million.

    So, like many other measures on the Internet, it is flawed, but it has value.

  • Re:Alexa's Spiders (Score:5, Informative)

    by captnitro (160231) * on Monday July 23, 2007 @11:22AM (#19957521)
    I'm just ragging on you unnecessarily here -- but was Alexa following POSTed form actions or something? This is why there's a completely different verb for the alteration or deletion of a URI object (POST) vs reading one (GET). (And shame on somebody for sticking usernames and passwords in GET variables, if that was the case.) /nitpick
  • more specifically, (Score:3, Informative)

    by everphilski (877346) on Monday July 23, 2007 @11:34AM (#19957747) Journal
    trackware, not spyware, from your link: "is a program that installs a toolbar and gathers Internet browsing and search information." which is EXACTLY WHAT IT IS SUPPOSED TO DO in order to aggregate site popularity.
  • by jamie (78724) * Works for Slashdot <jamie@slashdot.org> on Monday July 23, 2007 @12:00PM (#19958089) Journal

    Hi Mandrake.

    Slashdot still logs every pageview (plus ajax). We drop them into MySQL and once a day run a data-massaging script on them then delete the oldest portion. We do have a pair of dedicated servers for this, but generally speaking the I/O is pretty low. It's very doable.

    One of the main reasons is detecting abuse in real-time (done by more scripts that run more frequently). I wrote a journal entry [slashdot.org] about one of those scripts, a while back.

  • Re:Spyware? (Score:3, Informative)

    by TheSHAD0W (258774) on Monday July 23, 2007 @12:27PM (#19958433) Homepage
    From the above article.

    And that's not even talking about the fact that Alexa's toolbar is pretty much spyware. How many Slashdot readers are giddy to install spyware? You either? Big surprise.

    I'd mod your article redundant, but I believe the point does need emphasizing.
  • by sitarah (955787) on Monday July 23, 2007 @12:33PM (#19958533) Homepage
    Then don't use Alexa. You can measure traffic in ways beyond toolbars. Try the following:
    1) Compete. It is free. It uses toolbars AND panels AND isps. It's not that accurate compared to Comscore.. but maybe Comscore is wrong.
    2) Buy something. Comscore uses a panel method with a careful demographic spread so they can extrapolate from their sample with a small percent-error.
    3) Buy Hitwise for percentages. It doesn't give you unique visitors, but it can give you comparisons and ranks and whatnot. They sit on top of ISPs and use a few panels. It is 20-30K for a year or so.
    4) Wait awhile until the IAB's audit leads to some common definitions and standards among the aforementioned companies. The Interactive Advertising Bureau and Media Rating Council are auditing Nielsen and Comscore to make sure there is more transparency into what defines all these metrics, how they are counted, and how they should be counted, forever after. In a few years, there might be some consistency in the industry, which will at least stop you from comparing apples and oranges once you get beyond over-counting SEO spammers.

    If you are concerned about the demographics and unselfconscious web surfing, you need to go with a company that looks at ISP data. That's right, everyone -- your service contracts with larger ISPs allow them to anonymously watch your traffic and sell it to companies like Hitwise with your demographic information. Suddenly, the 35-54 white male demographic with an 80K income in the south can be fully represented in the balloon-popping video site genre, until they start hiding behind a proxy. Because it is anonymous, it is even better than a panel, because they don't know they are being watched and don't change their behavior.
  • Re:Spyware? (Score:2, Informative)

    by Miseph (979059) on Monday July 23, 2007 @12:36PM (#19958583) Journal
    Parent's link is to a photo of an enlarged and strangely external male anus, and while not the classic goatse, is certainly a related image.

    Just in case anyone was wondering what the -1 Troll actually meant.
  • by mmxsaro (187943) on Monday July 23, 2007 @01:16PM (#19959141) Homepage
    I first heard of Alexa when it came out in the mid-90s (think early 1997). I recall many people adopting it for its search engine capabilities and ranking features ("What's popular on the web today?"). Think of this as the pre-Google era, when searching through millions of pages was a daunting task and you didn't know where to surf when you first connected to the Internet via dial-up. You'd install the toolbar through word-of-mouth (or you saw some flash banner ad...) thinking that it would help you find what you were looking for on the web. I guess the toolbar just stuck around and webmasters/SEO 'experts' picked it up a while back thinking it's great data to judge website traffic. True SEO experts will never worry about Alexa data, as it's very easy to manipulate. To some extent, you COULD say that Alexa's rankings are semi-accurate (although not precise in any way). If you have 1-2 million toolbars active and you want to see what's hot out there, Alexa isn't a bad place to start, but it's just like Slashdot's own poll system: "This whole thing is wildly inaccurate. Rounding errors, ballot stuffers, dynamic IPs, firewalls. If you're using these numbers to do anything important, you're insane."
  • Re:Alexa's Spiders (Score:2, Informative)

    by Jah-Wren Ryel (80510) on Monday July 23, 2007 @01:21PM (#19959193)

    If you call delete twice on the same record, the second time will have a failure result, rather than a success result like the first.
    You are misinterpreting "result" as "return value" -- idempotent is a mathematical term, and in math there are no "return values", just end results. Deleting always produces the same end result.

    Here's what the W3C says about it, although they also talk about a specific DELETE request on the same level in the http protocol as GET - http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html [w3.org]
  • Re:Spyware yup. (Score:2, Informative)

    by mmxsaro (187943) on Monday July 23, 2007 @01:28PM (#19959315) Homepage
    Not true. If you're using the Corporate version of Symantec Antivirus, you can allow Alexa on your computer by simply excluding it from your searches (see exclusions). Alexa will be listed there as "adware/spyware". Enable the ignore option on it and you should be fine.
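
On the idempotency point from the "Alexa's Spiders" thread above, a minimal Python sketch may help separate the two meanings of "result". The RecordStore class and its methods are hypothetical, invented for this illustration; they are not Slashdot's or Alexa's code.

```python
class RecordStore:
    """Toy in-memory store (hypothetical) used to illustrate idempotent delete."""

    def __init__(self, records):
        self._records = dict(records)

    def delete(self, key):
        """Remove key if present; return True only if something was removed."""
        existed = key in self._records
        self._records.pop(key, None)
        return existed

    def snapshot(self):
        """Return a copy of the current contents, i.e. the end state."""
        return dict(self._records)


store = RecordStore({1: "a", 2: "b"})

first = store.delete(1)          # return value of the first call
after_first = store.snapshot()   # end state after the first call
second = store.delete(1)         # return value of the repeated call
after_second = store.snapshot()  # end state after the repeated call

print("return values differ:", first != second)
print("end state unchanged:", after_first == after_second)
```

The return value changes between the two calls, but the state of the store does not, and the second property is the one idempotency is about.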

