Best website statistics package?
goodminton asks: "As the webmaster for a small but growing e-commerce site, I'm becoming increasingly interested in the quality of our site metrics. We currently use a Javascript-based counter that provides good but basic information; however, a recent Slashdot posting has me thinking the stats from our system may not be as accurate as we'd like. What do you think is the best website statistics package, and why?"
Google (Score:4, Interesting)
It's free!!
(you can register for the invite until it becomes publicly available)
Re:AWStats (Score:3, Informative)
I'm a big fan of AWStats (awstats.sourceforge.net).
We got sucked in by the pretty graphs too. Internally, awstats is a mess. Some of the worst spaghetti code I've seen in a while. As I already said [slashdot.org], I'm not optimistic about their ability to improve going forward.
Re:AWStats (Score:3, Insightful)
Re:Google (Score:1)
Re:Google (Score:1)
I'm sorry, but I'm creeped out by the amount of data google already has on everyone, I don't need to let them watch who is visiting what on my websites as well.
Re:Google (Score:2)
Re:Google (Score:2)
And "Big Brother" -- say, the NSA, is most probably watching Google. I mean, assuming that anyone at NSA has any clue at all, don't you think they know as much as Google does?
You want to install a network tap that gets the most interesting data and is easily analyzed? Install taps on google's uplink providers, assuming the lower-tech solutions (getting someone at google to give you access to the data, getting an inside person at google, rooting g
Keep in mind its flaws. (Score:1, Insightful)
$ grep goog /etc/hosts
0.0.0.0 www.google-analytics.com
0.0.0.0 google-analytics.com
Re:Keep in mind its flaws. (Score:2)
New Discussion System (Score:4, Interesting)
For those subscribers using Slashdot's new discussion system, this link [slashdot.org] will work better.
From the posting, though, I don't understand why you think your (Javascript-based) stats would be inaccurate, since only about 1.34% of users disabled or did not support Javascript.
That said -- I personally use Analog [analog.cx], and although it does give some fairly useful statistics such as search engine terms, most popular directories, referrers, etc., I don't find it gives me a very high level of insight into surfing habits. A log analysis tool like that may be a good starting point for you, though, if you don't currently do analysis of that sort.
Re:New Discussion System (Score:2)
Webalizer or Analytics (Score:3, Interesting)
Useful tools, but log-files are flawed (Score:3, Insightful)
Webalizer is very useful: we recently set up a new web site, and the information it provides has been handy for tweaking. It doesn't seem to provide everything we could want - there's no obvious way to gauge the relative popularity of different links on a given page, for example - but it does provide an idea of relative browser popularity among our visitors, which pages are most important (or at least most visited), and other useful information.
Of course, like all log file-based tools, it suffers from the
AWStats (Score:5, Informative)
I just went through this process for my employer. While I like Google Analytics (and currently use it for my personal web pages), it's a bit more focused on e-commerce than I need - although that may be good for you.
What I decided on was http://awstats.sourceforge.net/ [sourceforge.net]. It's got a pretty impressive feature list, and I like the look, and the sheer volume of data it can collect.
One caveat - the current version (6.5) has a command-injection vulnerability when run in cgi mode (as opposed to statically-created pages), so watch where & how you install it.
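One way around the CGI attack surface entirely is to run awstats from the command line and serve static HTML. A sketch, assuming a config named "mysite" and placeholder paths that will vary with your install:

```shell
# Update the stats database from the logs, then build static report pages.
# "mysite" and all paths here are placeholders -- adjust for your installation.
perl awstats.pl -config=mysite -update
perl tools/awstats_buildstaticpages.pl -config=mysite \
    -awstatsprog=/usr/local/awstats/wwwroot/cgi-bin/awstats.pl \
    -dir=/var/www/stats
```

Served as static pages, the vulnerable CGI entry point is simply never exposed to visitors.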
Re:AWStats (Score:5, Informative)
What I decided on was http://awstats.sourceforge.net/ [sourceforge.net]. It's got a pretty impressive feature list, and I like the look, and the sheer volume of data it can collect.
As someone who set up awstats for a high-traffic site last year, let me warn you -- beyond the available options, it ain't customizable. At all. The HTML generation is embedded in bits and pieces throughout their perl code. Some of the nastiest, spaghettiest mess I've ever seen. They don't even use stylesheets for proper styling. If it does exactly what you want, then fine. But be forewarned: if your needs ever change, don't expect awstats to change with them.
Re:AWStats (Score:1)
The horror!
Do they use tables and eat babies, too?
Re:AWStats (Score:3, Informative)
For the 1% of people who would like to change it, well, they should be aware that it isn't going to be for them, before they start working with it. Again, this is not that big an issue.
Re:AWStats (Score:1)
"beyond the available options, it ain't customizable. At all."
Sure it is...in the header part of the site: GNU GPL
Re:AWStats (Score:1)
http://www.mrunix.net/webalizer/ [mrunix.net]
Re:AWStats (Score:2, Informative)
Right: simply put awstats behind an .htaccess (http://www.javascriptkit.com/howto/htaccess3.shtml [javascriptkit.com]) and you are pretty safe. I also use awstats under Windows+IIS (it only needs ActivePerl to run under Win32) and it's rock solid.
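For anyone who hasn't done this before, the parent's suggestion boils down to a few lines of Apache config. A minimal sketch -- the file names here are placeholders, and the password file should live outside the web root:

```apache
# .htaccess in the awstats directory -- hypothetical paths
AuthType Basic
AuthName "Site statistics"
AuthUserFile /home/example/.stats-htpasswd
Require valid-user
```

Create the password file with `htpasswd -c /home/example/.stats-htpasswd youruser`, and make sure the directory has `AllowOverride AuthConfig` (or stronger) so the .htaccess is honored.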
Bye.
None (Score:5, Interesting)
If you are trying to find out how many people are visiting your site, or how popular particular browsers are, just give up now. No stats package can tell you that. Some pretend to, but it's snake oil.
The basic problem is that not only are you fighting against the basic nature of a stateless protocol, but the things that skew your numbers (proxies, caching, etc) skew your numbers by an unknowable amount. Some things inflate your numbers, some things hide visitors from you. They don't cancel each other out like some people tell you (just think about it). In some cases, your visitors might not even communicate with your server at all.
Web statistics are good for measuring server load and monitoring things like search terms people use to find your site, inbound links from referrers, etc. What you will find is that you can install any old stats package, and it will give you lots of pretty charts and numbers, but at the end of the day, you might as well make the numbers up, because they don't reflect reality. And yet for some reason, people still like having them, even when they know the numbers are totally wrong. I have yet to figure out why.
Re:None (Score:3, Funny)
These metrics empower us by quantifying the effectiveness of our economic paradigm, and allows us to leverage synergies with other business divisions. Furthermore, we can collect empirical datums, which may allow us to project customer interactions with our portal, and allow our business methods to expand dynamically going forward. Oh, and it'll help with that Web 2.0 thi
Re:None (Score:2)
I had no idea "thingy" was a management buzzword.
I definitely need to get out more.
Re:None (Score:2)
Re:None (Score:2)
One word (Score:1)
Trends.
Re:One word (Score:2)
Re:One word (Score:3, Insightful)
Sorry, no. Let's say that AOL tune their caching parameters and all of a sudden a hundred thousand of your visitors get a page from AOL's cache instead of from your server. The "trend" will show a massive decrease in visitors, even if the number of visitors you have remains static.
Looking at the difference between two incorrect numbers will not result in a correct number.
Re:One word (Score:2)
Sorry, but yes. You can easily account for large events like this when you see them. If these changes happen to coincide with some recent marketing campaign it can be tricky, of course, to analyze the source of the variances,
Re:One word (Score:2)
That's just it - events like this look identical to a drop in visitors. So "when you see them" never applies, because you don't know when you are seeing an event like this and when you are simply seeing fewer visitors.
And even if you could tell when an event like this happens - how are you going to account for them? You don't know how much they are affecting your numbers, because (for example) a single cache could be serving your
Re:One word (Score:2)
The only thing you are demonstrating with your comments is that you haven't got the faintest clue about what you are talking about. In any real web company, the marketing and ad sales and data mining people go frantic at ANY change that isn't predicted in advance, and spend a lot of time a
Re:One word (Score:2)
I think you are a little disconnected from reality here. The vast majority of "real" web companies are tiny. They certainly don't employ data miners, and the marketing and ad sales people - assuming that there are full-time employees handling this - have enou
Re:One word (Score:1)
I said this in my first comment:
I never said that statistics are all bad, just that you can't derive certain information from them.
Re:One word (Score:2)
No they don't. It's very easy to determine whether AOL has suddenly proxy-cached your site, and to differentiate that from a sudden fundamental drop in the actual number of visitors arriving at your site from AOL. There are both technical (meaning carefully crafting your site to collect better raw data) and mathematical (meaning sophisticated analysis of existing historical, current, and ongoing future data) approaches to this. Even
Re:One word (Score:2)
Wow, talk about paranoid. It's easy to see why pro-stats packages people might have an agenda (e.g. they might work for a company selling snake oil), but what could I possibly gain from criticising the results?
This is probably the most ridiculous response I've had so far. Let's sum up the responses I've had:
Re:One word (Score:2)
Your agenda, apparently, is to argue. Some people thrive on this. Perhaps to prove a point, perhaps to feel important, perhaps because you feel ripped off or betrayed or left out as the only one who can't see what others are seeing. We're actually trying to help, but your argumentative style is getting in the way. If you
Flawed does not mean worthless (Score:5, Interesting)
Throwing your hands up in the air and declaring that because you cannot be sure it's all garbage is foolishness. Know the limitations of your tools, accept the error, and take what you can get.
Re:Flawed does not mean worthless (Score:2)
Perhaps you missed it, but I covered this in my original comment. It's not simply that the information is incorrect, it's that you have no way of knowing how incorrect it is. That is what makes it worthless. You could be a few dozen visitors out or you could be a million visitors out. No useful conclusions can be drawn when the margin of error is unknowable.
Re:Flawed does not mean worthless (Score:2)
Such differences can be caused by any number of different things unrelated to the number of visitors you have. You are assuming that the effect of all the different ways in which your visitor counts are wrong stays constant from month to month. That's a very unreliable assumption.
Er, nope (Score:3, Interesting)
Re:Er, nope (Score:2, Interesting)
or just completely block *.google-analytics.com because urchin is the single most annoying thing on the internet.
I'm so sick of waiting for pages to load, only to see "contacting google-analytics.com" in the status bar.
It's the one thing that made me install the adblock extension. I don't care if you're t
Re:Er, nope (Score:2)
Ditto... For me it wasn't so much the delay as it was the fact that if you selectively allow the slashdot authentication cookies, then block their urchin cookies, it forcibly logs you out on all following pages... Most annoying for my slashdot browsing experience... (Yes, I did just say that my slashdot addiction made me block google analytics.)
Interestingly,
Re:Er, nope (Score:2)
I don't think it will be long until there's a cookie wildcard blocker available - like adblock, but for cookies. But, I'm sure that when that arrives, the analytics firms will just start creating randomly named cookies that would pass such filters - not an enormous task. Or just
Re:Er, nope (Score:2)
Funny that you mention this... I use Firefox with the Adblock [mozilla.org] extension, and here are some rules found in my preferences:
This works quite well for me. I added these rules several months ago, when urchin started appearing on some web pages and making t
Re:None (Score:1)
Assuming that I:
* Only care about PEOPLE (not robots) visiting my site
* Use a javascript tracker
* Account for a margin of error (about 99% of traffic to my sites uses JS)
How can my statistics be skewed by caching, when the individual people will still report statistics via JS regardless of the cache? Plus you can get a good measure of the skew by comparing JS tracking with log analysis.
Anyways, I'm o
Re:None (Score:2)
Well, I can spot two problems with that approach off the top of my head. Firstly, how did you establish that only 1% of your visitors do not have JavaScript available, and how do you measure this on an ongoing basis? Secondly, JavaScript is not a binary yes/no question. There are different levels of support for JavaScript. The real statistic you need to worry about is how many of your visitors support the level of HTML and JavaScript necessary for your tracker to operate properly.
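The "compare JS tracking with log analysis" idea from upthread can at least be roughed out from the server side. A toy sketch using a made-up log, where requests for a hypothetical "tracker.js" stand in for the JS counter firing; the gap between the two counts approximates your non-JS (or beacon-blocked) share:

```shell
# Fabricated common-format log lines for illustration only.
cat > access.log <<'EOF'
1.2.3.4 - - [01/Jan/2006:00:00:01] "GET /index.html HTTP/1.1" 200 512
1.2.3.4 - - [01/Jan/2006:00:00:02] "GET /tracker.js HTTP/1.1" 200 64
5.6.7.8 - - [01/Jan/2006:00:00:03] "GET /index.html HTTP/1.1" 200 512
9.9.9.9 - - [01/Jan/2006:00:00:04] "GET /about.html HTTP/1.1" 200 256
9.9.9.9 - - [01/Jan/2006:00:00:05] "GET /tracker.js HTTP/1.1" 200 64
EOF
# Pages served vs. JS beacons fired: the shortfall is your untracked share.
pages=$(grep -c 'GET /[a-z]*\.html' access.log)
beacons=$(grep -c 'GET /tracker\.js' access.log)
echo "pages=$pages beacons=$beacons"
```

Of course this only measures visitors who reached your server at all; as noted elsewhere in the thread, cached visitors are invisible to both counts.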
Sawmill rocks (Score:5, Informative)
Re:Sawmill rocks (Score:4, Informative)
Re:Sawmill rocks (Score:2)
Sawmill also did a good job of analyzing our load-balanced set of web servers (allowing us to roll-up a set of combined stats).
AWSTATS / WEBALIZER / ANALYTICS (Score:3, Informative)
-MS2k
AWStats (Score:1)
One caveat, though, if you choose to implement AWStats: you should keep it in an access-restricted area of your webserver. There have been some pretty nasty vulnerabilities in AWStats. As long as you keep it secured, you should be fi
BBClone + Webalizer (Score:5, Informative)
DIY (Score:2)
You say *small* so I say do your own. My code that collects and saves the actual data is less than a hundred lines of very simple PHP.
When you're just beginning you want to follow each individual session. Most of the packages just aggregate data; they don't give info on a session level. You want info on *pages* viewed, not gifs and css and what-not.
Other advantages:
* Send yourself a jabber message every time special things happen. Just a case o
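Even without writing the PHP collector, the "follow one session" idea can be sketched straight from a raw access log. A toy example with a fabricated log (real logs use the common/combined format): filter to one visitor's IP, drop gifs/css/js, and you get that visitor's click path in order.

```shell
# Fabricated log for illustration -- field 6 is the requested URL.
cat > access.log <<'EOF'
10.0.0.1 - - [01/Jan/2006:10:00:00] "GET /index.html HTTP/1.1" 200 512
10.0.0.1 - - [01/Jan/2006:10:00:00] "GET /logo.gif HTTP/1.1" 200 900
10.0.0.2 - - [01/Jan/2006:10:00:05] "GET /index.html HTTP/1.1" 200 512
10.0.0.1 - - [01/Jan/2006:10:00:30] "GET /products.html HTTP/1.1" 200 700
10.0.0.1 - - [01/Jan/2006:10:01:10] "GET /checkout.html HTTP/1.1" 200 300
EOF
# One visitor's page views (no images/css), joined into a single path line.
path=$(awk '$1 == "10.0.0.1" && $6 !~ /\.(gif|css|js)$/ { print $6 }' access.log \
       | paste -sd' ' -)
echo "$path"
```

Note the usual caveat: an IP is only a rough proxy for a session (proxies and NAT lump visitors together), which is why the JS-tracker crowd sets cookies instead.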
web mining (Score:1, Interesting)
Re:web mining (Score:1)
Although, for another website it might just work: low traffic, but higher revenues per pageview...
Anyway, this is out of reach for most people because of the price
3 options (Score:2)
But frankly, I'd rather not let anyone hold all the details of my site, even Google, who of course want it for the mass of applicable marketing data. And the concept of linking to foreign scripts on my site isn't too tasty. Really, you should go for a SERVER solution that you run yourself. Apache can give you a wealth of information already. Posting every user's details every minute to Google is just crude.
I've read about this - if you're some t
Take a look at visitors (Score:1)
Tracewatch is my favourite (Score:1)
Re:Tracewatch is my favourite (Score:2)
However... it is not open source, and it looks like it might be quite processor intensive.
Phpmyvisites (Score:1)
Re:Phpmyvisites (Score:2)
online service: Onestat (Score:1)
On the downside: it was expensive and sometimes the backend was very slow.
On the upside: it was possible to measure visitors at both a very low and a very high level.
Visitors (Score:2)
ClickTracks and NetTracker (Score:2)
I was pretty impressed with our demo of NetTracker [sane.com] but it requires some serious cash if you have a busy site.
First think about what you need (Score:3, Interesting)
I use my own package when a Web site is smaller (say, below a million hits per month) because I would rather sample some actual sessions and see where people went and what they were searching for than get an overview. If you see people are searching for Argyle Socks and are finding your page about the Duke of Argyll, you might want to add an extra page and link to it, "if you were looking for...".
The statistic you most want is the searches people made that might have led them to your Web site but didn't, and that's the one you can't easily find!
For a site getting under 1,000 hits per day, look at the server logs in detail at least once a week, and make navigation easier, add more content where it looks promising, think about why some areas don't get traffic, etc etc.
When you're getting 10,000 hits/day, unless most of them are for graphics, the data can become overwhelming. And if you're over 100,000 hits per day you probably need to go to the sorts of reports that give you a very broad overview.
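Pulling the search terms out of the referrer field, as in the argyle-socks example above, takes only a short pipeline. A sketch over a fabricated combined-format log (real search referrers carry the query in a "q=" parameter; other engines use other parameter names):

```shell
# Fabricated combined-format log lines; the quoted URL after the byte count
# is the referrer.
cat > access.log <<'EOF'
1.1.1.1 - - [01/Jan/2006] "GET /socks.html HTTP/1.1" 200 99 "http://www.google.com/search?q=argyle+socks" "Mozilla"
2.2.2.2 - - [01/Jan/2006] "GET /duke.html HTTP/1.1" 200 99 "http://www.google.com/search?q=duke+of+argyll" "Mozilla"
3.3.3.3 - - [01/Jan/2006] "GET /socks.html HTTP/1.1" 200 99 "-" "Mozilla"
EOF
# Extract the q= query, strip the parameter name, and de-encode the spaces.
terms=$(grep -o 'q=[^&"]*' access.log | cut -d= -f2 | tr '+' ' ')
echo "$terms"
```

Feed the result through `sort | uniq -c | sort -rn` on a real log and you have a weekly "what were people looking for" report, which is exactly the input you need for those "if you were looking for..." pages.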
A link checker and a 404 report can be useful -- Cool URIs don't change! [w3.org]
Oh -- for anyone interested, although I do have hololog [sf.net] set up on, for example, my words and pictures from old books [fromoldbooks.org] Web site (in a private directory, sorry), the sourceforge page doesn't have a download, mea culpa. If it looks useful to anyone, I've shared copies of "hololog" in the past. It could do with some cleaning up, alas!
Liam
summary.net (Score:2)
It works great, is easy to set up and customize, and has lots of different kinds of reports available. Along with definitions for the different stat types (great for management to be able to understand what they're looking at).
Plus the developer is very responsive.
Advice from the field.... (Score:3, Informative)
I'm not much of a fan of log file based analytics systems. They are simply too much work to maintain from an infrastructure POV, and caching wreaks havoc with the accuracy of the stats. I therefore recommend 1x1 transparent pixel based systems. If you insist on log file based systems, NetTracker and WebTrends make some decent products.
Google analytics is a great package for smaller companies. It is free and offers a nice chunk of functionality. Caveat emptor -- you get what you pay for. When I audited my last employer's GA e-commerce metrics against actual online sales, there was a substantial error (I think ~10%)! However, it is still a good tool for understanding trends and issues with your analytics.
Webside Story (HBX) and Omniture rule the high end market. It has been a while since I checked pricing, but I think you can expect to start out at the ~$10-$20K/yr range. Both of these products are excellent.
Webside Story sells a lower end package (Hitbox Professional) that has limited commerce metrics but is also pretty decent and affordable. Their enterprise system, HBX, is excellent.
Omniture also has an impressive system. I don't think they have much in terms of entry level offerings.
WebTrends has a product, WebTrends Live, that is about 1/2 the price of the enterprise products from Webside Story and Omniture. It has been a good 5 years since I've used their product, but I wasn't especially impressed with it at the time.
The Shell (Score:1)
For me, you really can't beat a bit of grep, awk, wc and other bits of shell jiggery-pokery. I don't feel the need for webstats beautifiers, although they do have their place. With the vulnerabilities in awstats I wouldn't touch it with somebody else's barge pole these days, which is a shame because I used to really like the look and feel of it.
Analog and webalizer, from ports, might get used on some deployments, from time to time, but that is probably as far as it goes. Hit the console and be your own log
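The sort of shell jiggery-pokery the parent means is mostly sort/uniq pipelines. A sketch over a fabricated common-format log, producing a "top pages" table like the report tools do:

```shell
# Fabricated log -- in common log format, field 6 is the requested URL.
cat > access.log <<'EOF'
1.1.1.1 - - [01/Jan/2006] "GET /index.html HTTP/1.1" 200 512
2.2.2.2 - - [01/Jan/2006] "GET /about.html HTTP/1.1" 200 256
3.3.3.3 - - [01/Jan/2006] "GET /index.html HTTP/1.1" 200 512
4.4.4.4 - - [01/Jan/2006] "GET /index.html HTTP/1.1" 200 512
5.5.5.5 - - [01/Jan/2006] "GET /about.html HTTP/1.1" 200 256
EOF
# Most-requested pages, highest count first.
awk '{ print $6 }' access.log | sort | uniq -c | sort -rn
# Grab just the single most popular page.
top=$(awk '{ print $6 }' access.log | sort | uniq -c | sort -rn | head -1 | awk '{ print $2 }')
echo "$top"
```

Swap `$6` for the referrer or status-code field and the same pipeline gives you top referrers or a 404 report, which covers most of what the beautifiers produce anyway.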