The Real Problem With Alexa
As the de facto 'Guy in Charge' of a reasonably large web site, I am routinely asked questions by a variety of people that lead inevitably to Alexa. It might be a question from my boss at SourceForge about traffic. Or it might be a sales guy asked by a possible advertiser why some other random website is bigger or smaller than Slashdot. Most often it's a random reporter doing background for a story that has nothing to do with Slashdot. Why I'm considered an expert is very confusing, but why they always regard Alexa rankings as meaningful is even more so.
Here's the problem: Alexa doesn't work because of who will install it, and perhaps more importantly, who won't. Let's start with a place I'm very familiar with: Slashdot readers. Until recently Alexa didn't work on Firefox... instead only IE users participated. On the internet as a whole that's fine: like 80% of users run IE. But on Slashdot only like a quarter of you do.
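To put rough numbers on that, here is a minimal sketch of the arithmetic. The 80% and 25% IE shares are the ballpark figures from the paragraph above; the 2% toolbar install rate and the visit counts are made up purely for illustration, and the function name is hypothetical.

```python
def observed_visits(true_visits, ie_share, install_rate):
    """Visits a toolbar-based panel would see if only IE users can install it.

    install_rate is a hypothetical fraction of IE users running the toolbar.
    """
    return true_visits * ie_share * install_rate


# Two sites with identical real traffic but different browser mixes.
typical_site = observed_visits(1_000_000, ie_share=0.80, install_rate=0.02)
slashdot_like = observed_visits(1_000_000, ie_share=0.25, install_rate=0.02)

# How many times bigger the typical site looks, despite equal real traffic.
apparent_ratio = typical_site / slashdot_like
print(f"Typical site looks {apparent_ratio:.1f}x bigger")
```

Under these assumptions the install rate cancels out, so the distortion comes entirely from the browser mix, before any of the other self-selection effects kick in.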
What about re-installing the plug-in after you update your browser? When Firefox 2.0 came out, almost a third of Slashdot readers upgraded within a few days. You upgrade minor Firefox releases overnight. Even IE users of Slashdot update relatively fast, from 6 to 7 or even minor revisions. New versions often break old plug-ins. When you get that alert that a plug-in is out of date do you just forget about it? I know I do. And that's not even counting clean OS installs. But when I look at the installations of random non-technical friends and family, I frequently see versions of software so dated it makes me cringe.
And that's not even talking about the fact that Alexa's toolbar is pretty much spyware. How many Slashdot readers are giddy to install spyware? You either? Big surprise. Because of who we are, and what it is, our population will self select out of consideration.
Did you know Alexa excludes SSL? How many etrade users do you think there are? Now personally I'm glad that they aren't tracking my browsing at my credit card company, but it's just another factor reducing accuracy.
Equally perplexing is the accounting of iframes. Let's look at someone like DoubleClick's Alexa rating. Now it's hard to say, but I don't think I've ever visited their website. Have you? But according to Alexa, they have nearly a 1% share of the internet. I'd tend not to believe it... but they have iframes on zillions of web pages and counting those sure would account for this huge ranking. What about all those badges for the popular social networking websites? What influence are those iframes having on Alexa rankings? Alexa's FAQ says they don't count, but I'm skeptical.
In fact, Alexa KNOWS that it is a flawed metric. Have you ever tried actually looking up Alexa on Alexa? Unsurprisingly, it is unavailable. Why? Visitors to Alexa.com would be the most likely of any user population on-line to have installed their plug-in. I don't know what their 'Rank' would be, but I bet it clearly would be an apples to oranges comparison against ANY other site on-line.
Of course who do you think actually will go out of their way to install something like this? I have a good guess... if you are obsessed with acronyms like SEO or terms like PageRank you are very likely to care very much about these things. I spend a real percentage of my week dealing with people flooding my systems with garbage content designed to screw with these ratings. And you know they all have the toolbar installed so their zillions of worthless spam websites are being counted.
This problem has parallels elsewhere of course: The Nielsen ratings struggle to account for PVRs. Since you got a TiVo, when was the last time you watched "Live" TV? This is part of why Science Fiction shows struggle on TV... scifi fans are early adopters. So we stopped getting counted and our favorite genres are butchered by networks and lost to the void. PVR users tend to be wealthy (those boxes are expensive) and educated. Now I'm not saying that the dumbing down of TV is exclusively the fault of Tivo, but it sure didn't help that we weren't being counted as excellent "Smart" TV shows get canceled while we keep getting more seasons of Survivor. Who we are and how we live causes us to not be counted, and this has unintended consequences.
So what do we do? I wish I had a good answer to this. My first suggestion would be that if anyone mentions Alexa to you that you freak out and go on a 5-minute rant about how Alexa is stupid and anyone who is using it to seriously make a business decision should be fired. It doesn't actually help, but I estimate that every time I do this, I burn the same number of calories as I might on an elliptical trainer. I assure you the beer gut ain't getting smaller on its own.
Alternatively you could just install the toolbar on every machine you can find and skew the numbers ridiculously towards people that are likely unrepresented. Of course, the conspiracy theorists amongst you will just bitch that I'm trying to fudge Slashdot's own rankings in a system I'm claiming to hate. But that only helps prove my point... the conspiracy theorist is a demographic strongly represented on Slashdot that is unlikely to trust this software. We all ignore a broken status quo "Gold" standard that would fail a 100 level college science class on the grounds of flawed methodology. And this only leads to us not being counted.
Re:Rant as news (Score:5, Informative)
The tags were there before TFA.
Furthermore, you will need to re-read it, because in your race to FP you likely only read the front page blurb.
-nB
Re:I must be stupid... (Score:4, Informative)
Note: you don't need to install the toolbar to figure out Alexa rankings. Check out the Search Status [quirk.biz] extension for Firefox. I have mine sitting at the bottom right corner of the browser to display PageRank and Alexa rankings.
Business? (Score:4, Informative)
It was a massive wake-up call to realise how many middle-managers and the like will quite happily swallow any old crap as long as they perceive that it's authoritative. Has anyone ever tried to tell them about how bad the information is? (real question btw - I'm interested in seeing if other readers' experiences were as bleak as mine).
Alexa is useful (Score:3, Informative)
hit count.
It is certainly good for a crude ranking of sites - Slashdot's rank right now is 558, and that clearly means a lot more traffic than some site with a rank of 5 million.
So, like many other measures on the Internet, it is flawed, but it has value.
Re:Asked and answered (Score:5, Informative)
Hi Mandrake.
Slashdot still logs every pageview (plus ajax). We drop them into MySQL and once a day run a data-massaging script on them then delete the oldest portion. We do have a pair of dedicated servers for this, but generally speaking the I/O is pretty low. It's very doable.
One of the main reasons is detecting abuse in real-time (done by more scripts that run more frequently). I wrote a journal entry [slashdot.org] about one of those scripts, a while back.
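For anyone wondering what that kind of pipeline looks like in miniature, here is a rough sketch. It is not Slashdot's actual code or schema: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table names, columns, and function names are all invented for illustration.

```python
import sqlite3

DAY = 86400  # seconds per day


def open_log():
    """Create an in-memory database with hypothetical raw and summary tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pageviews (ts INTEGER, ip TEXT, url TEXT)")
    db.execute("CREATE TABLE daily_totals (day INTEGER, url TEXT, hits INTEGER)")
    return db


def log_pageview(db, ts, ip, url):
    """Record one pageview (ts is a Unix timestamp)."""
    db.execute("INSERT INTO pageviews VALUES (?, ?, ?)", (ts, ip, url))


def daily_rollup(db, day_start):
    """The once-a-day 'massaging' step: summarize one day's hits per URL."""
    db.execute(
        "INSERT INTO daily_totals "
        "SELECT ?, url, COUNT(*) FROM pageviews "
        "WHERE ts >= ? AND ts < ? GROUP BY url",
        (day_start, day_start, day_start + DAY),
    )


def prune(db, cutoff_ts):
    """Delete the oldest raw rows; returns how many were removed."""
    return db.execute("DELETE FROM pageviews WHERE ts < ?", (cutoff_ts,)).rowcount


def busy_ips(db, since_ts, threshold):
    """The more frequent abuse check: IPs with at least `threshold` hits since since_ts."""
    rows = db.execute(
        "SELECT ip FROM pageviews WHERE ts >= ? "
        "GROUP BY ip HAVING COUNT(*) >= ? ORDER BY ip",
        (since_ts, threshold),
    )
    return [ip for (ip,) in rows]
```

The point of the split is that the raw table stays small (it only holds recent rows), while the summary table keeps the history you actually report on.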
Re:Spyware? (Score:3, Informative)
And that's not even talking about the fact that Alexa's toolbar is pretty much spyware. How many Slashdot readers are giddy to install spyware? You either? Big surprise.
I'd mod your article redundant, but I believe the point does need emphasizing.
There are alternatives (Score:2, Informative)
1) Compete. It is free. It uses toolbars AND panels AND ISPs. It's not that accurate compared to Comscore... but maybe Comscore is wrong.
2) Buy something. Comscore uses a panel method with a careful demographic spread so they can extrapolate from their sample with a small percent-error.
3) Buy Hitwise for percentages. It doesn't give you unique visitors, but it can give you comparisons and ranks and whatnot. They lay on top of ISPs and use a few panels. It is 20-30K for a year or so.
4) Wait awhile until the IAB's audit leads to some common definitions and standards among the aforementioned companies. The Interactive Advertising Bureau and Media Rating Council are auditing Nielsen and Comscore to make sure there is more transparency into what defines all these metrics, how they are counted, and how they should be counted, forever after. In a few years, there might be some consistency in the industry, which will at least stop you from comparing apples and oranges once you get beyond over-counting SEO spammers.
If you are concerned about the demographics and unselfconscious web surfing, you need to go with a company that looks at ISP data. That's right, everyone -- your service contracts with larger ISPs allow them to anonymously watch your traffic and sell it to companies like Hitwise with your demographic information. Suddenly, the 35-54 white male demographic with an 80K income in the south can be fully represented in the balloon-popping video site genre, until they start hiding behind a proxy. Because it is anonymous, it is even better than a panel, because the people being measured don't know they are being watched and don't change their behavior.
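The panel approach in option 2 boils down to weighting each demographic group by its real share of the population instead of its share of the panel. Here is a minimal sketch of that idea; the group names and every number are invented for illustration, and this is not a description of how Comscore or anyone else actually computes their figures.

```python
def extrapolate(panel, population):
    """Estimate total visitors by scaling each group's visit rate to its population.

    panel maps group -> (panel members, members who visited the site);
    population maps group -> number of people in that group overall.
    """
    total = 0.0
    for group, (panel_size, visitors) in panel.items():
        rate = visitors / panel_size        # visit rate observed within the group
        total += rate * population[group]   # scale to the group's real size
    return total


# Hypothetical panel that over-samples one group relative to the population.
panel = {"tech": (100, 40), "general": (900, 9)}
population = {"tech": 1_000_000, "general": 99_000_000}

weighted = extrapolate(panel, population)

# Naive estimate that ignores who is on the panel.
panel_total = sum(size for size, _ in panel.values())
visitor_total = sum(v for _, v in panel.values())
naive = visitor_total / panel_total * sum(population.values())

print(f"weighted: {weighted:,.0f}  naive: {naive:,.0f}")
```

With these made-up numbers the naive estimate comes out several times larger than the weighted one, which is the same kind of skew a self-selected toolbar population produces.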
Re:Spyware? (Score:2, Informative)
Just in case anyone was wondering what the -1 Troll actually meant.
Re:Alexa's Spiders (Score:2, Informative)
Here's what the W3C says about it, although they also talk about a specific DELETE request on the same level in the http protocol as GET - http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.