
Best website statistics package?

Posted by Cliff
from the better-than-a-hit-counter dept.
goodminton asks: "As the webmaster for a small but growing e-commerce site, I'm becoming increasingly interested in the quality of our site metrics. We currently use a Javascript-based counter that provides good but basic information; however, a recent Slashdot posting has me thinking the stats from our system may not be as accurate as we'd like. What do you think is the best website statistics package, and why?"
This discussion has been archived. No new comments can be posted.

  • Google (Score:4, Interesting)

    by $exyNerdie (683214) on Friday May 26, 2006 @11:05PM (#15414384) Homepage Journal
    You might want to try this: http://www.google.com/analytics/ [google.com].

    It's free!!
    (you can register for the invite until it becomes publicly available)
    • Google Analytics seems to be excellent. All sorts of statistics are available. I use it, and I can't really grumble about anything.
    • by bugg (65930)
      Does google really need any MORE information about you and your website?

      I'm sorry, but I'm creeped out by the amount of data google already has on everyone, I don't need to let them watch who is visiting what on my websites as well.
      • Agreed. I'm fine with google collecting data from their own websites, but I hate seeing "Waiting for google-analytics.com..." at every website that I visit. It creeps me out for some reason. To that end, I installed the NoScript extension and haven't looked back. I learned that 99% of javascript on the web is doing something to benefit the site owner (rather than me, the visitor), so I've turned javascript off globally and only turn it on for sites that use it for something good. NoScript is powerful e
        • by bugg (65930) *
          Big Brother might not be watching, but Google sure is.

          And "Big Brother" -- say, the NSA, is most probably watching Google. I mean, assuming that anyone at NSA has any clue at all, don't you think they know as much as Google does?

          You want to install a network tap that gets the most interesting data and is easily analyzed? Install taps on google's uplink providers--- assuming the lower tech solutions (getting someone at google to give you access to the data, getting an inside person at google, rooting g

    • by Anonymous Coward
      First, google now has very detailed info on your customers. Customers like me, who view the source and see that you are providing such info to third parties without telling them, will leave your site and find somewhere else to shop. Also, anyone with a brain has google analytics blocked in their hosts file:

      $ grep goog /etc/hosts
      0.0.0.0 www.google-analytics.com
      0.0.0.0 google-analytics.com
      • Finally. Someone other than me has a brain. People need to stop voluntarily bugging their websites. Maybe Google isn't evil today. I suppose that, at one point, Microsoft wasn't either. And suppose Google never becomes evil... the recent subpoena attempt made by an increasingly-nosy government highlights the danger of inviting even the most innocent corporation to share your visitors' most intimate secrets.
  • by perlionex (703104) * <josephNO@SPAMganfamily.com> on Friday May 26, 2006 @11:08PM (#15414390) Homepage

    For those subscribers using Slashdot's new discussion system, this link [slashdot.org] will work better.

    From the posting, though, I don't understand why you think your (Javascript-based) stats would be inaccurate, since only about 1.34% of users disabled or did not support Javascript.

    That said -- I personally use Analog [analog.cx], and although it does give some fairly useful statistics such as search engine terms, most popular directories, referrers, etc., I don't find it gives me a very high level of insight into surfing habits. A log analysis tool such as that may be a good starting point for you, though, if you don't currently do analysis of that sort.

    • Ditto parent's comments. Analog is an oldie but a goodie - it provides all the basic functionality you need and does it via log file analysis ... so no tweaks required on your HTML, nor any dependence on Javascript being enabled in the browser. Not all the bells and whistles of the newer stuff, but a great way to start.
  • by Toveling (834894) * on Friday May 26, 2006 @11:08PM (#15414391)
    Webalizer [mrunix.net]. Just feed it some nice Apache logs, and let it do the talking. Or, if you're less of a command-line guy, I've heard Google Analytics [google.com] is great.
    • Webalizer is very useful: we recently set up a new web site, and the information it provides has been handy for tweaking. It doesn't seem to provide everything we could want - there's no obvious way to gauge the relative popularity of different links on a given page, for example - but it does provide an idea of relative browser popularity among our visitors, which pages are most important (or at least most visited), and other useful information.

      Of course, like all log file-based tools, it suffers from the

  • AWStats (Score:5, Informative)

    by GuruBuckaroo (833982) on Friday May 26, 2006 @11:09PM (#15414396) Homepage

    I just went through this process for my employer. While I like Google Analytics (and currently use it for my personal web pages), it's a bit more focused on e-commerce than I need - although that may be good for you.

    What I decided on was http://awstats.sourceforge.net/ [sourceforge.net]. It's got a pretty impressive feature list, and I like the look, and the sheer volume of data it can collect.

    One caveat - the current version (6.5) has a command-injection vulnerability when run in cgi mode (as opposed to statically-created pages), so watch where & how you install it.

    • Re:AWStats (Score:5, Informative)

      by eddeye (85134) on Friday May 26, 2006 @11:43PM (#15414506)

      What I decided on was http://awstats.sourceforge.net/ [sourceforge.net]. It's got a pretty impressive feature list, and I like the look, and the sheer volume of data it can collect.

      As someone who set up awstats for a high-traffic site last year, let me warn you -- beyond the available options, it ain't customizable. At all. The HTML generation is embedded in bits and pieces throughout their perl code. Some of the nastiest, spaghettiest mess I've ever seen. They don't even use stylesheets for proper styling. If it does exactly what you want, then fine. But be forewarned: if your needs ever change, don't expect awstats to change with them.

      • They don't even use stylesheets for proper styling.

        The horror!

        Do they use tables and eat babies, too?

        • Re:AWStats (Score:3, Informative)

          by gbjbaanb (229885)
          So awstats is not configurable. That's not necessarily a bad point - nearly everyone wants awstats on their sites, and they're happy with the look of it out of the box. But people should be aware that they cannot change it. 99% of the time, this is an absolute non-issue: it gets installed, works, looks pretty. Job sorted.

          For the 1% of people who would like to change it, well, they should be aware that it isn't going to be for them, before they start working with it. Again, this is not that big an issue.

          Fo
      • "beyond the available options, it ain't customizable. At all."

        Sure it is...in the header part of the site: GNU GPL

    • Webalizer also does a great job of parsing your logfiles and producing graphs and charts:

      http://www.mrunix.net/webalizer/ [mrunix.net]
    • Re:AWStats (Score:2, Informative)

      by hunte (455338)

      One caveat - the current version (6.5) has a command-injection vulnerability when run in cgi mode (as opposed to statically-created pages), so watch where & how you install it.

      Right: simply put awstats behind an .htaccess (http://www.javascriptkit.com/howto/htaccess3.shtml [javascriptkit.com]) and you are pretty safe. I also use awstats under Windows+IIS (it only needs ActivePerl to run under Win32) and it rocks.

      Bye.
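      As a sketch of that .htaccess suggestion -- the paths, realm, and user names below are placeholders, not AWStats defaults, so adjust them for your install:

```apache
# Hypothetical /var/www/awstats/.htaccess
# Password file created beforehand with:
#   htpasswd -c /etc/apache2/awstats.passwd admin
AuthType Basic
AuthName "Web statistics"
AuthUserFile /etc/apache2/awstats.passwd
Require valid-user
```

      Note that the enclosing <Directory> block in your main Apache config must have AllowOverride set to permit AuthConfig, or the .htaccess file will be silently ignored.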

  • None (Score:5, Interesting)

    by Bogtha (906264) on Friday May 26, 2006 @11:10PM (#15414397)

    If you are trying to find out how many people are visiting your site, or how popular particular browsers are, just give up now. No stats package can tell you that. Some pretend to, but it's snake oil.

    The basic problem is that not only are you fighting against the basic nature of a stateless protocol, but the things that skew your numbers (proxies, caching, etc) skew your numbers by an unknowable amount. Some things inflate your numbers, some things hide visitors from you. They don't cancel each other out like some people tell you (just think about it). In some cases, your visitors might not even communicate with your server at all.

    Web statistics are good for measuring server load and monitoring things like search terms people use to find your site, inbound links from referrers, etc. What you will find is that you can install any old stats package, and it will give you lots of pretty charts and numbers, but at the end of the day, you might as well make the numbers up, because they don't reflect reality. And yet for some reason, people still like having them, even when they know the numbers are totally wrong. I have yet to figure out why.

    • Re:None (Score:3, Funny)

      by Jester998 (156179)
      And yet for some reason, people still like having them, even when they know the numbers are totally wrong. I have yet to figure out why.

      These metrics empower us by quantifying the effectiveness of our economic paradigm, and allows us to leverage synergies with other business divisions. Furthermore, we can collect empirical datums, which may allow us to project customer interactions with our portal, and allow our business methods to expand dynamically going forward. Oh, and it'll help with that Web 2.0 thi
      • Oh, and it'll help with that Web 2.0 thingy project...

        I had no idea "thingy" was a management buzzword.

        I definitely need to get out more.
        • Oh, and it'll help with that Web 2.0 thingy project...

          I had no idea "thingy" was a management buzzword.

          I definitely need to get out more.

          Just wait until NEXT year, when they come out with Thingy 2.0 to compete with Apple's iThingy

        • I didn't actually use 'thingy' strictly as a management buzzword -- it was a reference to concepts usually seen in tech cartoons like Dilbert or User Friendly (i.e. that management doesn't actually have a clue about the technologies they want to implement).
    • And yet for some reason, people still like having them, even when they know the numbers are totally wrong. I have yet to figure out why.

      Trends.

      • Webtrends, haha. NetIQ's package gives lots of great info in regard to trends, although it is quite expensive. It's amazing the parent completely missed the whole idea of web log analysis like that.
      • Re:One word (Score:3, Insightful)

        by Bogtha (906264)

        Sorry, no. Let's say that AOL tune their caching parameters and all of a sudden a hundred thousand of your visitors get a page from AOL's cache instead of from your server. The "trend" will show a massive decrease in visitors, even if the number of visitors you have remains static.

        Looking at the difference between two incorrect numbers will not result in a correct number.

        • Sorry, no. Let's say that AOL tune their caching parameters and all of a sudden a hundred thousand of your visitors get a page from AOL's cache instead of from your server. The "trend" will show a massive decrease in visitors, even if the number of visitors you have remains static.

          Sorry, but yes. You can easily account for large events like this when you see them. If these changes happen to coincide with some recent marketing campaign it can be tricky, of course, to analyze the source of the variances,

          • You can easily account for large events like this when you see them.

            That's just it - events like this look identical to a drop in visitors. So "when you see them" never applies, because you don't know when you are seeing an event like this and when you are simply seeing fewer visitors.

            And even if you could tell when an event like this happens - how are you going to account for them? You don't know how much they are affecting your numbers, because (for example) a single cache could be serving your

            • That's just it - events like this look identical to a drop in visitors. So "when you see them" never applies, because you don't know when you are seeing an event like this and when you are simply seeing fewer visitors.

              The only thing you are demonstrating with your comments is that you haven't got the faintest clue about what you are talking about. In any real web company, the marketing and ad sales and data mining people go frantic at ANY change that isn't predicted in advance, and spend a lot of time a

              • In any real web company, the marketing and ad sales and data mining people go frantic at ANY change that isn't predicted in advance, and spend a lot of time and effort ensuring they understand exactly why, and how to compensate for those effects

                I think you are a little disconnected from reality here. The vast majority of "real" web companies are tiny. They certainly don't employ data miners, and the marketing and ad sales people - assuming that there are full-time employees handling this - have enou

            • That's just it - events like this look identical to a drop in visitors.

              No they don't. It's very easy to determine whether AOL has suddenly proxy-cached your site, and to differentiate that from a sudden fundamental drop in the actual number of visitors arriving at your site from AOL. There are both technical (meaning carefully crafting your site to collect better raw data) and mathematical (meaning sophisticated analysis of existing historical, current, and ongoing future data) approaches to this. Even

              • it's clear to me now that you've got an agenda

                Wow, talk about paranoid. It's easy to see why pro-stats packages people might have an agenda (e.g. they might work for a company selling snake oil), but what could I possibly gain from criticising the results?

                This is probably the most ridiculous response I've had so far. Let's sum up the responses I've had:

                • I'm wrong because I have an agenda.
                • I'm wrong because I'm trolling.
                • I'm wrong because people make decisions based on the numbers.
                • I'm wrong bec
                • It's easy to see why pro-stats packages people might have an agenda (e.g. they might work for a company selling snake oil), but what could I possibly gain from criticising the results?

                  Your agenda, apparently, is to argue. Some people thrive on this. Perhaps to prove a point, perhaps to feel important, perhaps because you feel ripped off or betrayed or left out as the only one who can't see what others are seeing. We're actually trying to help, but your argumentative style is getting in the way. If you

    • by ChaosDiscord (4913) * on Saturday May 27, 2006 @12:02AM (#15414568) Homepage Journal
      Just because the information you get is flawed doesn't mean the information is worthless. Most data in the real world is deeply flawed, and yet useful information can be extracted, useful trends determined. Sure, your log files will be skewed by who chooses to participate (that is, who isn't caught by caches and proxies; if you're using Javascript, who is allowing the javascript in question). But any survey is skewed by those who choose to participate.

      Throwing your hands up in the air and declaring that because you cannot be sure it's all garbage is foolishness. Know the limitations of your tools, accept the error, and take what you can get.

      • Just because the information you get is flawed doesn't mean the information is worthless.

        Perhaps you missed it, but I covered this in my original comment. It's not simply that the information is incorrect, it's that you have no way of knowing how incorrect it is. It is that which makes it worthless. You could be a few dozen visitors out or you could be a million visitors out. No useful conclusions can be drawn when the margin of error is unknowable.

    • Er, nope (Score:3, Interesting)

      by cliveholloway (132299)
      We use Urchin - now "Google Analytics". Unless you want to delete cookies every page hit, and use the Web Developer Firefox plugin to remove hidden fields for every form submission, we pretty much have you tracked. This isn't 1995 y'know...
      • Re:Er, nope (Score:2, Interesting)

        by mabinogi (74033)
        > Unless you want to delete cookies every page hit, and use the Web Developer Firefox plugin to remove hidden fields for every form submission, we pretty much have you tracked. This isn't 1995 y'know...

        or just completely block *.google-analytics.com because urchin is the single most annoying thing on the internet.

        I'm so sick of waiting for pages to load, only to see "contacting google-analytics.com" in the status bar.

        It's the one thing that made me install the adblock extension. I don't care if you're t
        • It's the one thing that made me install the adblock extension. I don't care if you're tracking me. I do care if you're ruining my browsing experience.

          Ditto... For me it wasn't so much the delay, as it was the fact that if you selectively allow the slashdot authentication cookies, then block their urchin cookies, it forcibly logs you out on all following pages... Most annoying for my slashdot browsing experience... (Yes, I did just say that my slashdot addiction made me block google analytics.)

          Interestingly,
        • Nope, we host urchin (we bought it a while back - version 4, I believe). It's all local, the js, cookies etc. Unless you want to start selectively deleting individual cookies after each page visit, there's not much you can do right now.

          I don't think it will be long until there's a cookie wildcard blocker available - like adblock, but for cookies. But, I'm sure that when that arrives, the analytics firms will just start creating randomly named cookies that would pass such filters - not an enormous task. Or just
      • We use Urchin - now "Google Analytics". Unless you want to delete cookies every page hit, and use the Web Developer Firefox plugin to remove hidden fields for every form submission, we pretty much have you tracked.

        Funny that you mention this... I use Firefox with the Adblock [mozilla.org] extension, and here are some rules found in my preferences:

        *.google-analytics.com
        *urchin.js

        This works quite well for me. I added these rules several months ago, when urchin started appearing on some web pages and making t

    • Let me just start by saying, I'm no expert on web statistics. But I am a web developer.

      Assuming that I:

      * Only care about PEOPLE (not robots) visiting my site
      * Use a javascript tracker
      * Account for a margin of error (Only 99% of traffic to my sites use JS)


      How can my statistics be skewed by caching, when the individual people will still report statistics via JS regardless of the cache? Plus, you can get a good measure of the skew by comparing JS tracking with log analysis.

      Anyways, I'm o
      • Well I can spot two problems with that approach off the top of my head. Firstly, how did you establish that only 1% of your visitors do not have JavaScript available, and how do you measure this on an ongoing basis? Secondly, JavaScript is not a binary yes/no question. There are different levels of support for JavaScript. The real statistic you need to worry about is how many of your visitors support the level of HTML and JavaScript necessary for your tracker to operate properly.

        Anyways, I'm of the

  • Sawmill rocks (Score:5, Informative)

    by toybuilder (161045) on Friday May 26, 2006 @11:13PM (#15414404)
    Sawmill [sawmill.net] is an awesome slicer-and-dicer of your web logs. I haven't done web stuff in several years, but the package was awesome five years ago, and it looks like they've been refining the product over the years.

  • by madstork2000 (143169) * on Friday May 26, 2006 @11:52PM (#15414538) Homepage
    If you can get invited to analytics it really rocks. Awstats is good, and I have always been fond of webalizer. I run a small hosting company, and I have found that awstats and webalizer can be a bit processor intensive under certain conditions. The nice thing about analytics is that the processing takes place off site. Analytics also has a lot more information geared toward marketing, and the metrics that can help make marketing decisions. Awstats and especially webalizer are more about presenting data from the logs.

    -MS2k

  • I'm a big fan of AWStats [sourceforge.net]. It primarily gets its stats from parsing your access_log, but it also includes a javascript portion you can elect to use if you're interested in collecting more detailed information about your visitors (screen resolution, flash versions, etc.).

    One caveat, though, if you choose to implement AWStats is that you should keep it in an access-restricted area of your webserver. There have been some pretty nasty vulnerabilities in AWStats. As long as you keep it secured, you should be fi
  • BBClone + Webalizer (Score:5, Informative)

    by Shawn is an Asshole (845769) on Saturday May 27, 2006 @01:28AM (#15414734)
    If you're using PHP, you need to give BBClone [bbclone.de] a try. Just do an include from your scripts and it's good to go. The stats it generates are quite nice. I also use Webalizer on the server logs.
  • by jginspace (678908)

    As the webmaster for a small but growing e-commerce site...

    You say *small* so I say do your own. My code that collects and saves the actual data is less than a hundred lines of very simple PHP.

    When you're just beginning you want to follow each individual session. Most of the packages just aggregate data; they don't give info on a session level. You want info on *pages* viewed; not gifs and css and what-not

    Other advantages:

    * Send yourself a jabber message every time special things happen. Just a case o
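    A sketch of that pages-not-assets filtering idea in plain shell; the log lines and the extension list below are made-up sample data (combined log format assumed), purely for illustration:

```shell
# Made-up sample access log in Apache combined-ish format.
cat > access.log <<'EOF'
1.2.3.4 - - [27/May/2006:11:05:00 +0000] "GET /index.html HTTP/1.1" 200 5120
1.2.3.4 - - [27/May/2006:11:05:01 +0000] "GET /logo.gif HTTP/1.1" 200 1024
1.2.3.4 - - [27/May/2006:11:05:01 +0000] "GET /style.css HTTP/1.1" 200 512
5.6.7.8 - - [27/May/2006:11:06:00 +0000] "GET /products.html HTTP/1.1" 200 4096
EOF

# Keep only "page" views: drop requests whose path ends in a
# static-asset extension (gifs, css, and what-not).
grep -Ev '"GET [^ ]+\.(gif|jpg|png|css|js|ico) ' access.log
```

    The same idea extends to session-level reporting: once assets are filtered out, grouping the remaining lines by IP and user-agent gives a rough per-session page trail.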

  • web mining (Score:1, Interesting)

    by Anonymous Coward
    For me the most interesting feature of a statistics package is being able to do web mining. Very few do this, I only know the one I am using ( metriserve web analytics [metriserve.com]). Basically it allows you to find hidden links between pages of your site even when they do not directly link to each other. This gives interesting results on one of my pr0n sites. You would not believe the hidden relations you can find between models, poses, etc as surfed by my visitors. I am sure the data could actually be used for an intere
    • It's a shame it's that costly; I would need to take that 1 million pageviews/month plan for a single site alone, and it does not generate that much revenue to justify it...

      Although, for another website it might just work: low traffic, but higher revenue per pageview...

      Anyways, this is out of reach for most people because of the price :(
  • The google one is supposedly good and it doesn't cost anything.

    But frankly, I'd rather not let anyone hold all the details of my site, even Google, who of course want it for the mass of applicable marketing data. And the concept of linking to foreign scripts on my site isn't too tasty. Really, you should go for a SERVER solution, that you run yourself. Apache can give you a wealth of information already. Posting every user's details every minute to Google is just crude.

    I've read about this - if you're some t
  • I think you might like visitors: http://www.hping.org/visitors/ [hping.org]
  • I ran AWStats for some time, but there were too many security issues. Now I am using Tracewatch at http://www.tiouw.com/ [tiouw.com]. Free, open source and great. See http://www.tracewatch.com/ [tracewatch.com]. Nice features are that you are able to follow individual visitors through your sites, find the most common paths, etc.
  • I used phpmyvisites [phpmyvisites.net] before, and it isn't too bad - setup was a breeze, it gives good stats. but I've moved to google analytics now.
  • The company I used to work for, a top 10 Dutch e-commerce business, used Onestat [onestat.com]. The marketing managers went crazy over it. They liked it a lot.

    On the downside: it was expensive and sometimes the backend was very slow.

    On the upside: it was possible to measure the visitors at a very low as well as a high level.
     
  • I really like the Visitors package (http://www.hping.org/visitors/ [hping.org]). It does a really good job identifying multiple hits as a single visitor (if timestamp, user-agent, etc. match). It also has some very good summaries of Google hits.
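    That grouping heuristic can be roughly sketched in a few lines of awk. This is only an illustration of the idea (hits from the same IP and user-agent within 30 minutes count as one visit), not Visitors' actual algorithm, and it uses a made-up simplified log of epoch-seconds, IP, and user-agent:

```shell
# Made-up simplified hit log: epoch_seconds ip user_agent
cat > hits.log <<'EOF'
1000 1.2.3.4 Firefox
1100 1.2.3.4 Firefox
1200 5.6.7.8 MSIE
5000 1.2.3.4 Firefox
EOF

awk '{
    key = $2 " " $3
    # start a new visit if this ip+agent is unseen or idle > 1800s
    if (!(key in last) || $1 - last[key] > 1800)
        visits++
    last[key] = $1
}
END { print visits " visits" }' hits.log
```

    On the sample data this counts three visits: two from the Firefox user (the 3900-second gap splits them) and one from the MSIE user.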
  • ClickTracks [clicktracks.com] has some interesting features geared towards visitor behavior. However, I've only just started using it, and I already have some doubts about the accuracy of their numbers. It is also missing some of the basic information you would expect from a traditional web stats program.

    I was pretty impressed with our demo of NetTracker [sane.com] but it requires some serious cash if you have a busy site.
  • by Ankh (19084) * on Saturday May 27, 2006 @07:51PM (#15418285) Homepage
    As with most things, it's not really that one package is "better" than another so much as that one might be more useful to you at any given time.

    I use my own package when a Web site is smaller (say, below a million hits per month) because I would rather sample some actual sessions and see where people went and what they were searching for than get an overview. If you see people are searching for Argyle Socks and are finding your page about the Duke of Argyll, you might want to add an extra page and link to it, "if you were looking for...".

    The statistic you most want is the things people looked for that might have reached your Web site and didn't, and that's the one you can't easily find!

    For a site getting under 1,000 hits per day, look at the server logs in detail at least once a week, and make navigation easier, add more content where it looks promising, think about why some areas don't get traffic, etc etc.

    When you're getting 10,000 hits/day, unless most of them are for graphics, the data can become overwhelming. And if you're over 100,000 hits per day you probably need to go to the sorts of reports that give you a very broad overview.

    A link checker and a 404 report can be useful -- Cool URIs don't change! [w3.org]

    Oh -- for anyone interested, although I do have hololog [sf.net] set up on, for example, my words and pictures from old books [fromoldbooks.org] Web site (in a private directory, sorry), the sourceforge page doesn't have a download, mea culpa. If it looks useful to anyone, I've shared copies of "hololog" in the past. It could do with some cleaning up, alas!

    Liam
  • I have to recommend summary.net

    It works great, is easy to set up and customize, and has lots of different kinds of reports available, along with definitions for the different stat types (great for management to be able to understand what they're looking at).

    Plus the developer is very responsive.

  • by steppin_razor_LA (236684) on Tuesday May 30, 2006 @12:43AM (#15426939) Homepage Journal
    I've done web analytics implementations for smaller (e.g. $10M e-commerce sites) and larger (e.g. hundreds of millions of PV/month) companies.

    I'm not much of a fan of log file based analytics systems. They are simply too much work to maintain from an infrastructure POV, and caching wreaks havoc with the accuracy of the stats. I therefore recommend 1x1 transparent pixel based systems. If you insist on log file based systems, NetTracker and WebTrends make some decent products.
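    For reference, the 1x1 transparent pixel approach just means embedding a tiny remote image on every page, so the stats server sees one loggable request per page view even when the page itself is served from a cache. A hypothetical snippet (stats.example.com and the query parameter are placeholders for whatever vendor or in-house endpoint is in use):

```html
<!-- hypothetical tracking pixel: the stats server logs each request;
     the image is typically served with no-cache headers so proxies
     do not hide repeat views -->
<img src="http://stats.example.com/pixel.gif?page=%2Fproducts"
     width="1" height="1" alt="">
```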

    Google Analytics is a great package for smaller companies. It is free and offers a nice chunk of functionality. Caveat emptor -- you get what you pay for. When I audited my last employer's GA e-commerce metrics against actual online sales, there was a substantial error (I think ~10%)! However, it is still a good tool for understanding trends and issues with your analytics.

    WebSideStory (HBX) and Omniture rule the high end market. It has been a while since I checked pricing, but I think you can expect to start out in the ~$10-$20K/yr range. Both of these products are excellent.

    WebSideStory sells a lower end package (Hitbox Professional) that has limited commerce metrics but is also pretty decent and affordable. Their enterprise system, HBX, is excellent.

    Omniture also has an impressive system. I don't think they have much in terms of entry level offerings.

    WebTrends has a product, WebTrends Live, that is about half the price of the enterprise products from WebSideStory and Omniture. It has been a good 5 years since I've used their product, but I wasn't especially impressed with it at the time.
  • For me, you really can't beat a bit of grep, awk, wc and other bits of shell jiggery-pokery. I don't feel the need for webstats beautifiers, although they do have their place. With the vulnerabilities in awstats I wouldn't touch it with somebody else's barge pole these days, which is a shame because I used to really like the look and feel of it.

    Analog and webalizer, from ports, might get used on some deployments, from time to time, but that is probably as far as it goes. Hit the console and be your own log
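    As a concrete illustration of that grep/awk/wc jiggery-pokery, here is a minimal sketch against a made-up combined-format log (the log lines are invented sample data):

```shell
# Made-up sample access log for illustration.
cat > access.log <<'EOF'
1.2.3.4 - - [27/May/2006:11:05:00 +0000] "GET /index.html HTTP/1.1" 200 5120
5.6.7.8 - - [27/May/2006:11:06:00 +0000] "GET /index.html HTTP/1.1" 200 5120
1.2.3.4 - - [27/May/2006:11:07:00 +0000] "GET /about.html HTTP/1.1" 200 2048
9.8.7.6 - - [27/May/2006:11:08:00 +0000] "GET /index.html HTTP/1.1" 404 512
EOF

# Total hits:
wc -l < access.log

# Most-requested paths ($7 is the request path in combined format):
awk '{ print $7 }' access.log | sort | uniq -c | sort -rn
```

    From there it is a short step to per-status breakdowns (`awk '{ print $9 }'`), top referrers, or whatever else the question of the day requires, with no package to install or patch.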
