The Internet Editorial

The Real Problem With Alexa

Alexa drives me nuts. It uses a broken methodology to measure the internet and is, for reasons unclear to anyone, regarded as somehow definitive simply because it allows you to compare two sites with a single simple number. Its sampling methodology is flawed and the numbers it produces are meaningless. And if you want to help me prove this, please install their toolbar. Of course since most of you are Slashdot readers, most of you won't and that only helps prove my point. Read on for what I mean by all of this, and why it matters.

As the de facto 'Guy in Charge' of a reasonably large web site, I am routinely asked questions by a variety of people that lead inevitably to Alexa. It might be a question from my boss at SourceForge about traffic. Or it might be a sales guy asked by a possible advertiser why some other random website is bigger or smaller than Slashdot. Most often it's a random reporter doing background for a story that has nothing to do with Slashdot. Why I'm considered an expert is very confusing, but why they always regard Alexa rankings as meaningful is even more so.

Here's the problem: Alexa doesn't work because of who will install it, and perhaps more importantly, who won't. Let's start with a place I'm very familiar with: Slashdot readers. Until recently Alexa didn't work on Firefox... instead only IE users participated. On the internet as a whole that's fine: like 80% of users run IE. But on Slashdot only like a quarter of you do.
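
To put rough numbers on the browser-mix argument, here is a small Python sketch. The 80% and 25% IE shares are the approximate figures quoted above; the visitor counts and the toolbar install rate are made up purely for illustration, and the model assumes (hypothetically) that only IE users can run the toolbar and that they install it at the same rate on every site.

```python
# Sketch of how an IE-only toolbar panel skews a comparison between two sites.
# All visitor counts and the install rate are hypothetical.

def panel_visitors(true_visitors: int, ie_share: float, install_rate: float) -> float:
    """Visitors the panel can see: only IE users who installed the toolbar."""
    return true_visitors * ie_share * install_rate

INSTALL_RATE = 0.02  # assumed: 2% of IE users run the toolbar

# Two sites with identical real traffic but different browser mixes.
typical_site = panel_visitors(1_000_000, ie_share=0.80, install_rate=INSTALL_RATE)
slashdot_like = panel_visitors(1_000_000, ie_share=0.25, install_rate=INSTALL_RATE)

# Ratio of measured audiences; the true ratio is 1.0.
measured_ratio = typical_site / slashdot_like
print(f"typical site panel visitors: {typical_site:.0f}")
print(f"slashdot-like panel visitors: {slashdot_like:.0f}")
print(f"measured ratio: {measured_ratio:.1f}x (true ratio: 1.0x)")
```

Under these assumptions the panel would make the typical site look roughly 3.2 times bigger than a site with exactly the same real traffic, and that is before counting the people who refuse the toolbar on principle.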

What about re-installing the plug-in after you update your browser? When Firefox 2.0 came out, almost a third of Slashdot readers upgraded within a few days. You upgrade minor Firefox releases overnight. Even IE users of Slashdot update relatively fast, from 6 to 7 or even minor revisions. New versions often break old plug-ins. When you get that alert that a plug-in is out of date do you just forget about it? I know I do. And that's not even counting clean OS installs. But when I look at the installations of random non-technical friends and family, I frequently see versions of software so dated it makes me cringe.

And that's not even talking about the fact that Alexa's toolbar is pretty much spyware. How many Slashdot readers are giddy to install spyware? You either? Big surprise. Because of who we are, and what it is, our population will self select out of consideration.

Did you know Alexa excludes SSL? How many etrade users do you think there are? Now personally I'm glad that they aren't tracking my browsing at my credit card company, but it's just another factor reducing accuracy.

Equally perplexing is the accounting of iframes. Let's look at someone like DoubleClick's Alexa rating. Now it's hard to say, but I don't think I've ever visited their website. Have you? But according to Alexa, they have nearly a 1% share of the internet. I'd tend not to believe it... but they have iframes on zillions of web pages and counting those sure would account for this huge ranking. What about all those badges for the popular social networking websites? What influence are those iframes having on Alexa rankings? Alexa's FAQ says they don't count, but I'm skeptical.

In fact, Alexa KNOWS that it is a flawed metric for measuring. Have you ever tried actually looking up Alexa on Alexa? Unsurprisingly, it is unavailable. Why? Visitors to Alexa's own site would be the most likely of any user population on-line to have installed their plug-in. I don't know what their 'Rank' would be, but I bet it clearly would be an apples to oranges comparison against ANY other site on-line.

Of course who do you think actually will go out of their way to install something like this? I have a good guess... if you are obsessed with acronyms like SEO or terms like PageRank you are very likely to care very much about these things. I spend a real percentage of my week dealing with people flooding my systems with garbage content designed to screw with these ratings. And you know they all have the toolbar installed so their zillions of worthless spam websites are being counted.

This problem has parallels elsewhere of course: The Nielsen ratings struggle to account for PVRs. Since you got a TiVo, when was the last time you watched "Live" TV? This is part of why Science Fiction shows struggle on TV... scifi fans are early adopters. So we stopped getting counted and our favorite genres are butchered by networks and lost to the void. PVR users tend to be wealthy (those boxes are expensive) and educated. Now I'm not saying that the dumbing down of TV is exclusively the fault of TiVo, but it sure didn't help that we weren't being counted as excellent "Smart" TV shows get canceled while we keep getting more seasons of Survivor. Who we are and how we live causes us to not be counted, and this has unintended consequences.

So what do we do? I wish I had a good answer to this. My first suggestion would be that if anyone mentions Alexa to you that you freak out and go on a 5-minute rant about how Alexa is stupid and anyone who is using it to seriously make a business decision should be fired. It doesn't actually help, but I estimate that every time I do this, I burn the same number of calories as I might on an elliptical trainer. I assure you the beer gut ain't getting smaller on its own.

Alternatively you could just install the toolbar on every machine you can find and skew the numbers ridiculously towards people that are likely unrepresented. Of course, the conspiracy theorists amongst you will just bitch that I'm trying to fudge Slashdot's own rankings in a system I'm claiming to hate. But that only helps prove my point... the conspiracy theorist is a demographic strongly represented on Slashdot that is unlikely to trust this software. We all ignore a broken status quo "Gold" standard that would fail a 100 level college science class on the grounds of flawed methodology. And this only leads to us not being counted.


  • Spyware? (Score:2, Insightful)

    by xXenXx ( 973576 ) on Monday July 23, 2007 @12:03PM (#19957201)
    Isn't Alexa considered spyware?

    It baffles me how people actually look to them for information, considering how they get it.
  • by Control Group ( 105494 ) * on Monday July 23, 2007 @12:06PM (#19957233) Homepage
    That's all true, but unless someone's got a better alternative, it doesn't matter.

    It isn't surprising that people who spend money on advertising want to have some metric by which to predict (estimate, guess, what-have-you) the impact of each dollar spent on web advertising. Assuming the people spending the money are, as a class, either stupid or ignorant is a mistake. Odds are good that many of them know that Alexa is flawed, but also consider any information better than nothing. If nothing else, Alexa rankings demonstrate the relative popularity of a web site among Alexa participants - which is at least a concrete demographic, and the stats are inarguable on that basis.

    What's being missed is that there's a fundamental problem, here. Populations which refuse to share information with such aggregators will always self-select against representation. It's no different, really, than stating that populations who do not vote self-select against being represented in government. That doesn't stop us from using elections as a way to select people into government.

    In the specific case of slashdot selecting against itself, it's debatable whether we're a demographic many organizations would even want to target (with web advertising) if they could. How many comments on how many stories have included someone claiming that he's either unaffected by or negatively affected by advertising? That he's less likely to buy a product he sees advertised? Broader yet, how do you suppose the median number of lifetime banner ad clicks for the slashdot user compares to that of the web-using population at large?

    I posit that we pose a particularly galling challenge to marketers. On the one hand (if you'll allow me a bit of net-cultural hubris), we're a demographic of above-average intelligence, above-average income, with an above-average tendency to spend money on brand new technology, and who have an above-average impact on what other people will buy. On the other, we refuse to share our habits with "big brother," we're easily offended (eg, we hate proprietary formats solely because they're proprietary), comparatively hard to bamboozle, and have a cultural predisposition towards "free" (both beer and speech). That is, on the one hand, we're a fantastic demographic to succeed with, but on the other, we're a tough nut to crack.

    The point is that Alexa is flawed, without a doubt. But it seems more flawed from the point of view of a group which deliberately makes itself all but impossible to measure. And frankly, if we're not willing to provide the information necessary for advertisers to make informed choices, we're going to continue to be ignored, both on the web and on television. (Yes, I do realize that Nielsen is specifically flawed with respect to DVRs - but even if they weren't, how many members of this site would voluntarily install habit-tracking software on their TiVo? How many members of this site would call for a boycott of TiVo if it installed it for them?)
  • Re:Rant as news (Score:5, Insightful)

    by tabacco ( 145317 ) on Monday July 23, 2007 @12:08PM (#19957271)
    That's probably why it's filed under 'Editorial'.
  • Re:So... (Score:3, Insightful)

    by eln ( 21727 ) * on Monday July 23, 2007 @12:14PM (#19957377)
    He says install it, and then in the very next sentence says that he knows you won't, because you're a Slashdot reader. The entire rest of the post is about why Alexa is flawed and shouldn't be used for anything by anyone.

    Sure, from a purely mercenary point of view he'd like you to install it so that advertisers stupid enough to use Alexa will see Slashdot's traffic represented, but he acknowledges that you almost certainly won't. Taco has never had the kind of relationship with his readership to where he could tell them to do something and they would go out and do it, unless that "something" was "post goatse links to Slashdot," and I'm pretty sure he knows that.
  • by saibot834 ( 1061528 ) on Monday July 23, 2007 @12:14PM (#19957389)
    So Alexa says they are not spying on the user. Big surprise.

    How can I verify what this toolbar is really doing unless I have the source code? IMHO the problem lies there: There is no trust for Alexa because nobody can really say for sure how it works and that it doesn't harm the user.
  • And we care...why? (Score:4, Insightful)

    by Itninja ( 937614 ) on Monday July 23, 2007 @12:17PM (#19957429) Homepage
    How is Alexa different than any other selective-survey system? The Nielsen ratings are acquired via 'diaries' (or occasionally set-top boxes). Radio 'listener share' is determined similarly by Arbitron. The NY-Times bestseller list is based on books sold to distributors, not books sold to the public (millions of unsold 'bestsellers' get pulped or donated to libraries every year).

    Just come to terms with the fact these organizations are in bed with advertisers and move on with your life.
  • Re:Alexa's Spiders (Score:1, Insightful)

    by Anonymous Coward on Monday July 23, 2007 @12:20PM (#19957481)
    why were usernames and passwords able to be captured in the first place...

    chroot httpd
  • by Anonymous Coward on Monday July 23, 2007 @12:21PM (#19957495)
    It's very different than the Nielsen ratings. Nielsen does not let random users join their system. They seek out a statistically valid sample population, and pay them for using the system. That is extraordinarily different.
  • by Anonymous Coward on Monday July 23, 2007 @12:23PM (#19957535)
    Alexa targets a demographic which is more likely to click on banner ads and buy the junk which they advertise. So for the advertisers targeting those demographics I'm sure it works out ok.
  • by LWATCDR ( 28044 ) on Monday July 23, 2007 @12:27PM (#19957609) Homepage Journal
    Slashdot is an extremely popular website with great demographics. It should be a huge money maker but it probably under performs.
    It doesn't show up all that well in Alexa because very few people that go to Slashdot use or would use the Alexa toolbar.
    It probably doesn't show up all that well with the advertisers because Slashdot readers are technically very sophisticated.
    What percentage of Slashdot users are blocking the ads on Slashdot? 80%? Slashdot should be the "Myspace" of the technical crowd. Heck it had the friends list long before Myspace was around. We have our Journals "aka" blogs so yeah it is a little Myspace full of bright people with money to spend. But it doesn't make that much money. Slashdot should be worth many millions but it isn't. The real problem isn't Alexa but how can Slashdot live up to its potential for that evil word. Profit. After all I am sure the Slashdot crew would like to make the big bucks.

  • by Control Group ( 105494 ) * on Monday July 23, 2007 @12:28PM (#19957629) Homepage
    Sure, but that all presupposes you've already bought ad space on the site in question. When you're trying to select which web sites to purchase ad space on in the first place, you don't have access to any of those metrics. If we were talking about a handful of key sites, that wouldn't be a problem - test the waters, go with what works.

    But given the huge number of web sites out there that run ads, you need some way of doing an initial selection of which ones to pay. Hence Alexa.
  • by Opportunist ( 166417 ) on Monday July 23, 2007 @12:30PM (#19957671)
    As a statistician, I can reassure you that the only thing that's worse than no data is flawed data. When you have no data, you know something is wrong and you start correcting that. When you have flawed data, you don't. Instead you use that data and build on it, never knowing that what you measure, calculate and estimate has nothing to do with reality. In other words, it can be dangerous, to your job and the company you're working for.

    Imagine the (flawed) data you have tells you that almost 100% of the people visiting your geek-gadget page are fans of some rock group. Why? Because they use a proxy that was written by some fan of said rock group whose proxy subtly alters the meta information sent by your browser to tell everyone you surf to how much you like said rock group. You analyze it and invest heavily into marketing crap from said group, hoping that your customers will buy it since they all apparently love that group.

    Result? Big disaster. Nobody buys it. Nobody even knows that group. They just all used the same proxy/plugin/younameit, not even knowing that whoever wrote it wanted to advertise his favorite band.
  • by a16 ( 783096 ) on Monday July 23, 2007 @12:34PM (#19957745)
    I don't think that your story is a very good indicator of how rubbish Alexa is, it just highlights issues with your own system.

    This is why you shouldn't use HTTP GET for 'delete links'. Anything that changes content should be POST, which will stop bots crawling your site just by following links from breaking things. We have standards for a reason..

    As for alexa crawling your site as a logged in user, what? As far as I know the toolbar itself doesn't do any crawling, only reporting. Maybe it was providing links to Alexa that later got indexed, but if they were properly secured then you wouldn't have any issues. The fact that you seem to be relying on a robots.txt for security indicates bigger issues. The only time I've heard of a 'Toolbar' doing this kind of thing is when Google released their proxy service (which they later withdrew), as it automatically preloaded all pages - and again poorly designed pages using GET to modify data encountered problems just like yours.
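
The GET-versus-POST point in the comment above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `/items/<id>/delete` route, the `handle` function and the in-memory `store` are all hypothetical, and real code would also need proper authentication, which this sketch leaves out entirely.

```python
# Minimal sketch: only POST may change state; GET is read-only.
# The route names and the in-memory store are hypothetical.

def handle(method: str, path: str, store: dict) -> tuple:
    """Tiny request handler returning (status_code, body)."""
    # Expected path shapes: /items/<id> or /items/<id>/delete
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "items":
        return 404, "not found"
    item_id = parts[1]
    wants_delete = len(parts) == 3 and parts[2] == "delete"

    if method == "GET":
        if wants_delete:
            # A crawler following links only issues GETs, so refuse here.
            return 405, "use POST to delete"
        if item_id in store:
            return 200, store[item_id]
        return 404, "not found"

    if method == "POST" and wants_delete:
        if item_id in store:
            del store[item_id]
            return 200, "deleted"
        return 404, "not found"

    return 405, "method not allowed"


store = {"1": "first post", "2": "second post"}
crawler_result = handle("GET", "/items/1/delete", store)   # bot follows a link
user_result = handle("POST", "/items/1/delete", store)     # deliberate form submit
```

With this shape, a bot that simply follows every link it finds gets a refusal and changes nothing, while a deliberate form submission still works.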
  • by Atlantis-Rising ( 857278 ) on Monday July 23, 2007 @12:34PM (#19957757) Homepage
    While you cannot verify what the software is actually DOING, you can monitor/verify what the software is saying.

    In many cases, not only is the latter more effective, from a cost/time/benefit perspective, it's also easier and provides far more useful information.

  • by Control Group ( 105494 ) * on Monday July 23, 2007 @12:38PM (#19957819) Homepage
    That's true if the person using the data is unaware that it's flawed. But an educated decision can be made to use data that's known to be flawed, if one evaluates what those flaws are, and what they'll mean to whatever it is you're doing.

    In fact, as I think about it, I'm not sure "flawed" is the right word. The information is incomplete; whether that's a flaw depends on whether or not you recognize that you don't have all the information.

    I think that assuming all the people using Alexa rankings to make purchasing decisions are stupid is misguided. I think it's a much safer assumption that the distribution of stupid, average, and intelligent people among that population is fairly close to that of the population at large. Many of them are making decisions based, in part, on having information that they know to be incomplete, which they judge to be preferable to making decisions based on having no information.
  • by DigiShaman ( 671371 ) on Monday July 23, 2007 @01:10PM (#19958205) Homepage
    It's the quality, not the quantity of your audience that matters. Despite the occasional trolling and flaming that goes on at Slashdot, it still upholds its audience as the most informed and highly intelligent. I can't say that for
  • Re:Alexa's Spiders (Score:3, Insightful)

    by someone300 ( 891284 ) on Monday July 23, 2007 @01:33PM (#19958541)
    From Wikipedia:

    a single call or multiple calls produce the same result and the same side effects to the entire system as a whole.
    It says here, not just the same effect on the system as a whole, but also the same result.

    If you call delete twice on the same record, the second time will have a failure result, rather than a success result like the first. Doesn't this make deletion non-idempotent?
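
One way to look at the question above, and the reading the HTTP specification takes, is that idempotence is about the effect on stored state rather than the status reported back to the caller. A rough Python sketch, with a hypothetical `delete_record` helper and an in-memory set standing in for the real records:

```python
# Sketch: a delete whose repeated calls leave the same state,
# even though the reported status differs between calls.

def delete_record(records: set, record_id: int) -> bool:
    """Remove record_id if present; return True only if something was removed."""
    if record_id in records:
        records.remove(record_id)
        return True
    return False


records = {1, 2, 3}
first = delete_record(records, 2)      # removes the record
state_after_first = set(records)
second = delete_record(records, 2)     # nothing left to remove
state_after_second = set(records)
```

Here the second call reports a different result, but the records are in the same state after one call as after two. Under the stricter wording quoted from Wikipedia, where the result must match as well, this same function would not qualify, so the answer depends on which definition is being applied.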
  • Re:Rant as news (Score:4, Insightful)

    by JWSmythe ( 446288 ) * <jwsmythe@jws[ ] ['myt' in gap]> on Monday July 23, 2007 @01:44PM (#19958699) Homepage Journal
    There's a neat thing in journalism. Editors retain the privilege of being able to commandeer any space they want in their publication, and say just about anything they want. In the format of Slashdot, the editorial would take the top most position on the page, until a newer story filled the position.

        I have been known to do the same thing on my site. It may be a "thank you" to our users. It may be a birth, death, or wedding announcement. It may just be that a particular topic has infuriated me to the extent that I needed to put my opinion in big bold letters on the front page, because no matter how much we may report on the topic, people still don't have a clue about the meaning.

        If Cmdr Taco had posted a news story on the poor metrics used by Alexa, would that have received the same attention that his own personal account did?

        Unfortunately, he's echoing what many of us already feel. Boss type people feel the need to rank high with Alexa. If the ranking goes down for any reason, they want it brought back up. Even on some of the less technical sites, discussions start about spyware, and people start removing these utilities. When that happens, the scores for those sites drop, and the score for and go up. (When's the last time you de-spywared the kids' computer?)

        I know from reliable information, that my news site is read heavily by those in the intelligence services around the world. I sincerely hope that they wouldn't have the Alexa toolbar on their machines. Most of our users are very aware of what's happening around them. They're the ones that are careful to keep their machines clean of viruses and spyware. That leaves us with the random users who follow links from other news sites, or find us in search engines. Maybe they'll stick around. Maybe they'll even learn something.

  • by Strilanc ( 1077197 ) on Monday July 23, 2007 @01:59PM (#19958917)
    That won't work well with ads in general, because your desires change over time. For example, you'll be interested in car ads only when you're considering buying a new car.

    Also, people like me would just vote every ad down until we didn't have to see anything. If I want to see advertisements for a product I'LL GO LOOK FOR IT.
  • by Dracos ( 107777 ) on Monday July 23, 2007 @02:02PM (#19958963)

    Always was, always will be. After decades, there is no agreed upon methodology for tracking the effectiveness of marketing dollars in the real world. The internet should make it easier, right? Perhaps, until people learn how to filter the internet. Doubleclick never sees me, because I have * in my hosts file, along with 37,000 other crap sites. I also add "*urchin\.js" to my custom filters in FilterSetG, so AdSense doesn't see me. I suspect other Slashdotters take similar measures.

    If a good click through rate on a banner ad is less than 1%, and only about 1% of clicks result in a sale, then the value of that banner to the advertiser is only .01% of its cost (yes, I know AdSense works differently, but it has its own pitfalls). Pathetic, isn't it?

    It makes you wonder how poorly traditional media ads actually perform.

    Banner ads, I'm pretty sure, are the first time advertisers have ever been able to measure the returns on ad dollars. Some company spends $20k for a full page ad in a magazine, how much of that came back in sales? No one knows. So just to be sure they don't lose sales, the company continues to buy ads, following some rough percentage of revenues. Demographics is the closest thing marketers have to concrete data... it basically says not to buy ads in Ladies' Home Journal if you're selling vintage car parts. Even then, demographics measures potential returns before the fact, not actual returns after the fact. So, advertising is a wild goose chase based on assumptions, and no one does, or can, really know what's going on.

    The internet should be a wake up call for advertisers to the fact that their marketing budgets are being overinflated by... (wait for it) the ad agencies and marketing firms. Sadly no one will realize this, because the foxes are in charge of the henhouse, and claim everyone will fall to ruin otherwise.

    Generally, people don't want the crap in the ads, and would rather not even see the ads. Horrible conversion rates prove this. The scariest part of Minority Report, other than the nanny-state concept of "pre-crime", is the level of advertising present everywhere in the film, targeted at individuals with laser-like precision. It got that way because the public allowed it to happen.

    The simplest way to fix advertising is to remove all imperative and presumptuous statements from them. No more "Call now!", "You need...", "But wait, there's more!" obnoxious mind games. I'm not calling, I don't need your shit, and I'm not waiting for you to yell at me some more.
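
The click-through arithmetic in the comment above can be worked through explicitly. Multiplying the two 1% rates gives the share of impressions that end in a sale, not the banner's value relative to its cost; turning that into a cost comparison needs a price per impression and a profit per sale, neither of which the comment gives. In this Python sketch the two rates are the comment's rough upper bounds, while the $5 CPM figure is a purely hypothetical input.

```python
# Worked arithmetic for the banner example; the rates are the comment's rough
# upper bounds, and the CPM figure used below is hypothetical.

click_through_rate = 0.01   # share of impressions that are clicked
conversion_rate = 0.01      # share of clicks that end in a sale

sales_per_impression = click_through_rate * conversion_rate
impressions_per_sale = 1 / sales_per_impression

def break_even_profit_per_sale(cost_per_thousand: float) -> float:
    """Profit each sale must bring in for the banner to pay for itself."""
    cost_per_impression = cost_per_thousand / 1000
    return cost_per_impression * impressions_per_sale

print(f"sales per impression: {sales_per_impression:.4%}")
print(f"impressions per sale: {impressions_per_sale:.0f}")
print(f"break-even profit at a $5 CPM: ${break_even_profit_per_sale(5.0):.2f}")
```

So at these rates it takes about 10,000 impressions to make one sale, and at the assumed $5 per thousand impressions each sale would have to clear about $50 in profit just to cover the ad.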

  • by ColdWetDog ( 752185 ) on Monday July 23, 2007 @02:08PM (#19959041) Homepage

    Well, digg has been beating slashdot for a year now, and is nearly a magnitude higher in rank.

    No, i guess the most recent event is 4chan passing slashdot...

    So idiots in general and pornography obsessed idiots in particular are more common than whatever-it-is-that-lurks-about Slashdot?

    Somehow, I feel better already.

  • As a statistician, I can reassure you that the only thing that's worse than no data is flawed data. When you have no data, you know something is wrong and you start correcting that. When you have flawed data, you don't.

    This is a huge assumption that I'd say is incorrect more often than not.

    Your entire argument as it stands now presupposes that the advertiser doesn't know the data is flawed. But what if he does?

    My company buys lots of web ads. We use Alexa as one of our data sources (not the only one) to determine ad buys, both because it's free and because in our experience, its data is no more or less accurate than that of paid vendors like Nielsen. Do we expect 100% accuracy? No. Do we think we can learn anything if, for example, it tells us that two directly competing sites have traffic that's different by about 200% in every metric? Probably.

    Buying ads is not an exact science. It doesn't really matter if we get accurate traffic down to the individual click. All we're looking for is relativity - a site's size and reach compared to its competitors. We look at the sites themselves, we look at Alexa and we look at research that we commission and pay for. Usually these sources all agree and we go ahead and buy. In the event that they don't agree, we use our own critical thinking and our own judgment to determine what to believe - that is part of any marketer's job, after all.

    It seems to me that this whole article here is missing the point. Alexa's a tool. A free tool. It is useful at what it does, but it is not, nor was it ever intended to be, some sort of accurate measure of site statistics for the entire internet. Nobody who uses it as part of their decision-making process is using it that way.

    I think this is a case where somebody looked at Alexa, figured out that it wasn't perfect, and therefore determined that it's utter crap. That's basically what your argument boils down to also. But the point is we don't need perfection, and we don't expect perfection, and this lack of perfection is taken into account in our decision making process. We're not flying to the moon here; we're buying ad space. It's something of an organic process regardless of how good your data is.

    If you're talking about somebody using Alexa for their own site, then that's just ridiculous. Even cheap hosting accounts (like I have for my personal site) come with their own log-based stats, and if not, there are plenty of free services like Statcounter out there. I don't think this is what many people use Alexa for, though; it's used more by small to mid-sized companies looking for sites on which to buy ads, or by curiosity seekers who just want to see how big their favorite sites are. I would think most sites would know what their own internal numbers are one way or another, without Alexa.
  • Re:Rant as news (Score:4, Insightful)

    by FST777 ( 913657 ) <frans-jan AT van-steenbeek DOT net> on Monday July 23, 2007 @02:30PM (#19959361) Homepage
    When talking about technological things like page-ranking and Alexa's use on that, yes. Yes they are.

    It's not that they are dumb in the wide version of the word, but in the techfield, Digg is arguably "dumber" than Slashdot. Try the same argumentation on Slashdot vs. MySpace.

    When I talk about my hobby or profession, I like to single out the 99% that doesn't understand a word of what I'm saying.
  • DoubleClick (Score:3, Insightful)

    by Mike_K ( 138858 ) on Monday July 23, 2007 @02:59PM (#19959777)
    CmdrTaco wrote:

    Equally perplexing is the accounting of iframes. Let's look at someone like DoubleClick's Alexa rating. Now it's hard to say, but I don't think I've ever visited their website. Have you? But according to Alexa, they have nearly a 1% share of the internet. I'd tend not to believe it...

    That's not surprising to me at all. I don't think this is because of all the iframes that pop up on pages, or they would have a much higher percentage than 1%. I think it's actual ad clicks. When you click an ad, you go to a doubleclick link which will redirect you to the advertiser's page. If all those ad clicks are counted as actual traffic, 1% is actually a very believable figure.

    And I've never heard of Alexa until now :)

  • by whitehatlurker ( 867714 ) on Monday July 23, 2007 @03:20PM (#19960085) Journal
    Agreed, the best way to see what they're collecting is to watch the stream and actually see it. However, the real question isn't what they collect, but what do they do with it? It's fairly obvious that they send back all of your browsing habits.
  • Re:MOD PARENT UP (Score:3, Insightful)

    by neoform ( 551705 ) on Monday July 23, 2007 @04:46PM (#19961353) Homepage
    It's not alexa's fault that they're logging in as you and spidering pages that robots.txt says not to spider? Seems to me that it's very much their fault.
  • by macraig ( 621737 ) on Monday July 23, 2007 @05:15PM (#19961767)
    This is for CmdrTaco and anyone else who wants to read it.

    Dude, it's the paradigm that sucks, not Alexa per se. Consider Nielsen ratings: would you or any self-respecting Slashdotter actually be so foolish as to agree to be a "Nielsen family"? I doubt it. It's the same dynamic at play. I blogged about the relative stupidity of Nielsen families in particular a while back; those people are ruining my ability to enjoy quality programming like Firefly, Space: Above and Beyond, Keen Eddie, and countless others because of their mindless plebeian tastes.

    These are also the same people who often cause unreasonable pricing for consumer items, because they're too stupid to know when to vote with their dollars and just say "no". "$70 for a set of warmed-over LucasFilm Star Wars films that already turned a profit three times over? No problem, I simply *must* have them!"

    As a result, manufacturers set prices based on this same mindless demographic; those of us who are "smart" consumers, who could wrangle a better, fairer price, are dragged along for the ride kicking and screaming.

    That's kinda what has happened here: you (CmdrTaco) are being dragged along kicking - and screaming - by all the Alexoids, and you don't like that any more than I like having Firefly yanked off the air.

    I'm quietly of the suspicion that national and especially online advertising is only a fraction as profitable as corporations think it is. I suspect if someone could do a truly objective cost-benefit analysis of mass advertising, like car commercials on TV, we'd find that it's actually costing money that is never rewarded in equivalent sales, and for which we're all ultimately footing the bill in the form of higher prices to pay down all that pointless advertising.

    Solving the "Alexa dilemma" just might require eugenics or some other speciation event.
