Which Web Statistics Package Would You Use? 83

Posted by Cliff
from the MRTG-anyone dept.
ken-doh asks: "We host about 200 customers' web sites on a Windows platform, and we want to provide them with a simple web statistics package to track hits and other useful pieces of information. We have been using Deepmetrix LiveStats XSP, which has been perfect for our customers, but since Microsoft purchased it, the product is no more, with support ending next year. So we need to buy a new stats package. Any ideas?"
This discussion has been archived. No new comments can be posted.

  • awstats all the way (Score:5, Informative)

    by Salvance (1014001) * on Thursday November 16, 2006 @12:38PM (#16871916) Homepage Journal
    I'd choose awstats. It's fast, very easy to use, looks pretty, and best of all ... it's free to use on Windows as well as Linux. Here is their main page on sourceforge [sourceforge.net], which also includes a nice little demo.
    • by unity100 (970058)
      yea and it also occasionally has security holes that quickly compromise a server.
      • Agreed. It also strays from linuxy conventions enough to make it annoying to use very often. It's virtually unscriptable without a conf file for each domain, and it's extraordinarily tough to "rename" confs (perl/sed FTW!). Finally, if something ever messes up and you miss a few days, you need to run each day sequentially. If you decide to do today's first, then yesterday's, it isn't going to work.

        So very unfriendly. So very, very insecure.

        I'd love to recommend webalizer, which excels in some of those ar
      • by Allador (537449)
        In typical use, awstats has nothing to compromise. It just emits .html files. There is no executable, or awstats code to access or compromise.

        Now if you're using a non-standard approach, and allowing people to hit the awstats cgi directly, then you suffer from this issue. But that only really works on small sites, due to performance issues.
      • by nullchar (446050)
        Agreed. It is best to place awstats behind .htaccess or some other non-public mechanism.
        Also, don't click on referral links in the web logs!
        The setup for multiple domains is a pain, but necessary with any other stats package I've used.
        • by badfish99 (826052)
          don't click on referral links in the web logs

          I'm not familiar with the package you're discussing, but anything that produces clickable links that it's dangerous to click on sounds to me like a disaster waiting to happen.
          • by pestie (141370) on Thursday November 16, 2006 @02:01PM (#16873260) Homepage
            It's not that the links are inherently dangerous. The problem is that clicking such a link will take you to the site the link points to (obviously), and your browser will dutifully report your referrer to the remote server. And if your referrer looks something like "http://www.example.com/top-secret-stats-directory/awstats-referrers.html" then you've just given some unknown server a "back door" into your web stats, allowing them to gather intelligence about your site. In many cases that's unimportant - either the site is an inconsequential personal web page, or the directory is password protected, or you're smart enough to use something [stardrifter.org] to prevent your browser from sending referrer information. But as we all know, many people don't do what they should, and sometimes little data leaks like this can lead to compromises.
      • by ednopantz (467288)
        And it provides almost no useful information! Want to know top referrers month to month? Sorry! We aggregated all the data and there's no way to do that. Want to know how many users have .NET 1.0, .NET 1.1 and .NET 2.0? Sorry! Don't track that. etc.

        Sure, it's free and open source, but I'll take useful over free any day.
    • by Enoxice (993945)
      I agree wholeheartedly. It's so simple to use, but gives so much information.
    • by Volcane (27387)
      and without trickery, it doesn't give a good daily detail view; lots of people care about daily detail to see how their ad optimisations are doing
    • I would recommend Google web master tools. If you use all features, you won't need any additional tool or site.
    • Sorry. I like AWstats, but it looks "pretty" in the sense that "neon green and neon pink make a bold colour scheme". Most of these reports are generated for marketing / management, so whilst you could argue the figures are what counts, their presentation is equally important - and this, sadly, is the area where most of the open source packages fail (note that many commercial ones do, too, but it's much more prevalent here) - they look garish, amateur, ugly, or all three.
    • by soliptic (665417)
      I donno man, I've got AWStats on one of my hosting accounts at the moment and it's frankly very very mediocre, compared to almost everything else I've tried -- some other online stats package I used to have with a different provider (whose name escapes me at the moment I'm afraid), some piece of freeware/shareware I downloaded that you run clientside (obviously you have to ftp the raw apache log files to your local pc too) - again I forget the name, I think it was something like WebLog Pro (dated from befor
  • why fix it? (Score:4, Insightful)

    by networkBoy (774728) on Thursday November 16, 2006 @12:38PM (#16871918) Homepage Journal
    If it ain't broke keep using it ;)
    or is that not an option for some reason?
    -nB
    • Re: (Score:1, Informative)

      by Anonymous Coward
      If it ain't broke keep using it ;)
      or is that not an option for some reason?
      -nB

      You missed something:
      " Microsoft purchased it, the product is no more, with support ending next year."
      • So it automatically quits working? I understand now.
        -nB

        on a completely unrelated note, /. should not have a 2 min timer for thread to thread posts. Should only be between posts in the same story.
        • Re: (Score:1, Flamebait)

          by cyber0ne (640846)

          So it automatically quits working?

          That's genuinely the advantage of running a Microsoft product :)

        • Re: (Score:3, Informative)

          by nahdude812 (88157) *
          In a manner of speaking, yes. I'm guessing they're using the hosted version of Livestats, so if they stop support, they are probably also taking down the hosted servers. The full non-hosted version is substantially more expensive (but cheaper if you do millions of hits a day like we do here).
  • by yagu (721525) * <yayagu@noSPaM.gmail.com> on Thursday November 16, 2006 @12:39PM (#16871928) Journal

    • pathological eclectic rubbish lister
    • awk
    • sed
    • one good staff member with a couple hours free time

    I've seen lots of different packages and frankly I sometimes wonder why people pay for them. They're typically (actually I guess they're literally) off-the-shelf stuff that, while offering nice and interesting features, doesn't cover everything for everybody. I think it's a "you get what you pay for" mentality, i.e., people insist on buying packages to do this kind of analysis.

    I've written probably more than 20 different web filters for various analyses because the OTS stuff didn't get me the info I wanted.

    And for any more-than-small IT staff, there's always someone there who knows the tools, and can slap together stat info and tweak it ad nauseam until management sees the analysis they think they want. Lots of staff will even write it on their own time -- they like to tinker with that stuff.

    Also, though I haven't looked, I'll bet there are some great CPAN modules that get you what you want as a good start with the added benefit of having the code for your own tweaking.

    Considering the article specifically is asking for simple web stats, I think sed, awk, perl, and others is a perfect way to go.

    Or, you could buy yet another package and risk Microsoft buying that product and disappearing it.
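
    A minimal sketch of the roll-your-own approach described above, written in Python rather than the awk/sed/perl named in the comment (a substitution made purely for readability). It assumes the server writes NCSA Common Log Format; the `count_hits` helper and the sample lines are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTO" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)')


def count_hits(lines):
    """Return a Counter mapping request path -> number of successful (2xx) hits."""
    hits = Counter()
    for line in lines:
        m = CLF.match(line)
        if m is None:
            continue  # skip malformed lines rather than guessing at their meaning
        path, status = m.group(4), m.group(5)
        if status.startswith("2"):
            hits[path] += 1
    return hits


# Hypothetical sample data, not taken from any real log.
SAMPLE = [
    '10.0.0.1 - - [16/Nov/2006:12:38:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.2 - - [16/Nov/2006:12:39:10 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.1 - - [16/Nov/2006:12:40:05 +0000] "GET /missing.gif HTTP/1.1" 404 210',
    '10.0.0.3 - - [16/Nov/2006:12:41:30 +0000] "GET /files/report.pdf HTTP/1.1" 200 90210',
]

if __name__ == "__main__":
    # Print the most requested paths first.
    for path, n in count_hits(SAMPLE).most_common():
        print(n, path)
```

    Something this small answers "how often was this file fetched" questions, but it does nothing about sessions, referrers or bots, which is where the packaged tools earn their keep.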

    • by nahdude812 (88157) *
      The kind of stats you can get from awk & sed with a couple of hours of free time could be useful when you want to know some very specific things (such as "How many times was this file downloaded in the last two weeks") and already know the general shape of your data (eg, what pages are in your site).

      This would not give you statistics that are probably a lot more important, such as:
      o Conversion rates for your advertising (are we getting more out of advertising with company X than we pay for it?)
      o What yo
    • > one good staff member with a couple hours free time

      Be careful. I work in an environment that was full of those kinds of things. Almost none of the admins or programmers stuck to any standards, naming schemes, or left any documentation. Most didn't even speak to one another. Thus, when I came onboard, I had to deal with a statistics collection system in multiple pieces in several different languages; many at the same time (a bash script that ran Perl scripts AND some compiled C modules on Cygwin on
  • one to avoid (Score:3, Informative)

    by Anonymous Coward on Thursday November 16, 2006 @12:42PM (#16871990)
    Avoid SmarterStats. In fact, avoid the entire SmarterTools suite of products at all costs. Buggy. Poorly designed. Horrible!
    • by stg (43177)
      I don't really use SmarterStats, although it is included on one of my sites, so I can't really say if it's any good or not.

      I do use SmarterMail, and while the regular mail filters are neat, their spam filters are horrible! At least 10x more spam gets through than the default CPanel SpamAssassin (which is not so great compared to a nicely tuned one).

      I didn't try this in the current version, but in a previous version, if I tried to erase a hundred messages through their web interface, the server would stop re
    • by flonker (526111)
      I'd also recommend avoiding WebTrends. Where it would take 12-18 hours to process log files, Analog [analog.cx] runs in a fraction of the time (under an hour).

      My issues with Analog are that I haven't discovered how to make it parse each log file only once, and I haven't discovered any way to have it display stats for different time periods (i.e. daily/weekly/monthly/quarterly/annually) all on one page. I'm not sure if these are real faults with the program, or if I just haven't figured out how to do it yet, so YMMV.
  • Job done. If you get enough traffic, you can pay, but it's free for something like under 1M hits/mo. And their campaign/tracking tools "pwn" you.

    I'd put a link, but, c'mon. google.com
    • Re:Google analytics (Score:4, Informative)

      by CerebusUS (21051) on Thursday November 16, 2006 @01:03PM (#16872308)
      The 1M page view limitation was for beta only. The current product has no limits.

      Google Analytics is a good solution if you are looking for tracking based on javascript / web bug images.

      Since they are looking for something that works off the server log files (such as LiveStats) maybe they should look at Urchin [google.com] which runs locally and processes the log files. Google purchased Urchin to make their Analytics offering. Unlike Microsoft and LiveStats, however, Google still sells and supports the Urchin software through retail partners.

      Pricing is fairly reasonable, and is based on log sources and websites monitored. $895 buys you 100 profiles with one log source each. $695 extra per additional log source (i.e. if you've got 3 servers serving one website you'd need 2 additional log sources) regardless of the number of profiles. $695 extra per 100 additional profiles, as well.
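
      The pricing rules above can be worked through as a quick calculation. This is only a sketch of the figures quoted in this comment (2006 prices, not a vendor price list); the `urchin_cost` helper is hypothetical, and it assumes a partial block of extra profiles is billed as a full block of 100.

```python
import math

BASE_PRICE = 895            # quoted: includes 100 profiles, one log source each
EXTRA_LOG_SOURCE = 695      # quoted: per additional log source
EXTRA_PROFILE_BLOCK = 695   # quoted: per additional 100 profiles
PROFILES_PER_BLOCK = 100


def urchin_cost(profiles, extra_log_sources=0):
    """Estimate the licence cost in USD from the figures quoted above.

    Assumption: extra profiles are sold in blocks of 100, rounded up.
    """
    extra_profiles = max(0, profiles - PROFILES_PER_BLOCK)
    extra_blocks = math.ceil(extra_profiles / PROFILES_PER_BLOCK)
    return (BASE_PRICE
            + extra_blocks * EXTRA_PROFILE_BLOCK
            + extra_log_sources * EXTRA_LOG_SOURCE)


if __name__ == "__main__":
    # 200 hosted sites, one log source each: 895 + 695 = 1590
    print(urchin_cost(200))
    # One block of profiles with 3 servers behind a site (2 extra log sources):
    # 895 + 2 * 695 = 2285
    print(urchin_cost(100, extra_log_sources=2))
```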

      They also offer campaign tracking and ecommerce reporting modules.

      One thing that's impressed me about the program is the speed. We're using it on 150 profiles (with a maximum of 6 log sources per profile, though only one profile actually uses that many. Most of our profiles use only one log source) and it takes about 8 hours to process the logs each day from a central box using smb/cifs to pull the data files.

      • I completely agree.
        As a Webtrends refugee, I found Urchin to be a gift from heaven, it really rocks the house.
        However, Urchin (as of ver. 5.x) does rely on Javascript and bugs.
        • by ePhil_One (634771)
          That would be awesome, except Urchin IS Google Analytics. Google bought the package, and while it looks like they may have revived the paid version [google.com] for those who don't trust Google with their weblogs, I would be nervous giving them my money.
        • by CerebusUS (21051)
          There is a component in Urchin that can make use of the javascript and webbugs, but it isn't required.

          It's also incompatible with Google's Analytics product. We regularly run the logs without the UTM (that's what they call the javascript piece) and set up the Google Analytics tracker instead. This lets us get historical data for sites and gives us better bandwidth usage stats for individual websites, while still allowing the end users to see the pretty graphs and awesome filtering Google gives them
    • I surf without javascript and have adblocked the analytics-domain. If you rely on google analytics, you won't see me. Log-based stats are way better.
      • by Aladrin (926209)
        On the other hand, maybe NOT tracking people like you is the more humane move? You obviously don't WANT to be tracked. Sure, it'll skew the information a bit, but in the end... Does it really matter?

        If most people went to the extremes you do for avoiding tracking, then yeah... I could understand the need to do the tracking locally. But most don't.
        • by Alphager (957739)
          Extremes? I have noscript, simply because I feel safer that way. I blocked google-analytics because it slowed down several sites I visit regularly. When you use google-analytics, how will you know how many of your users don't use javascript? It's easy to lose a huge percentage of your visitors.
  • Analog (Score:2, Interesting)

    by forrestf (1028150)
    http://www.analog.cx/ [analog.cx] that works well, at least for my servers
    • by Tteddo (543485)
      Analog and Report Magic work great together. I have about 100 domains with it and never a problem.
    • analog has my vote too. I've used it for years. It's got to be the most controllable package around. There are options to tweak most everything, and to do pretty complex filtering, aliasing, etc. Admittedly, most people never need to know all that, but it's good that it's there for the times we do.
  • by chroma (33185) <chroma@mind s p r i ng.com> on Thursday November 16, 2006 @12:47PM (#16872070) Homepage
    If you want a simple logfile analyzer, use AWStats, as mentioned earlier here.

    Google Analytics [google.com] is a little more sophisticated tool that requires you to embed a little bit of their code on every one of your pages. Also free to use.

    For totally custom reporting, move your log data to the database following the guide I wrote earlier this year [kuro5hin.org].
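
    A rough sketch of what "move your log data to the database" can look like. This is not the method from the linked guide, which is not reproduced here; it assumes Common Log Format input and uses SQLite through Python's standard `sqlite3` module. The table layout, the `load_log` helper and the sample lines are all hypothetical.

```python
import re
import sqlite3

# Common Log Format: host ident user [time] "METHOD path PROTO" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)')


def load_log(conn, lines):
    """Insert parsed log lines into a 'hits' table; return the number of rows added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "host TEXT, ts TEXT, method TEXT, path TEXT, status INTEGER, bytes INTEGER)"
    )
    rows = []
    for line in lines:
        m = CLF.match(line)
        if m is None:
            continue  # ignore lines that do not look like Common Log Format
        host, ts, method, path, status, size = m.groups()
        # The size field is "-" when no body was sent; store that as 0.
        rows.append((host, ts, method, path, int(status),
                     int(size) if size.isdigit() else 0))
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical sample data.
SAMPLE = [
    '10.0.0.1 - - [16/Nov/2006:12:38:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.2 - - [16/Nov/2006:12:39:10 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.2 - - [16/Nov/2006:12:39:11 +0000] "GET /style.css HTTP/1.1" 304 -',
]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_log(conn, SAMPLE)
    # Once the data is in SQL, a custom report is just a query.
    query = ("SELECT path, COUNT(*) AS n FROM hits "
             "WHERE status = 200 GROUP BY path ORDER BY n DESC")
    for path, n in conn.execute(query):
        print(n, path)
```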

    • by `Sean (15328)
      I second Google Analytics. I've been using it on about 40 Web sites and it's a great tool with zero configuration. Just drop in the Google code and go. The only concern is that Google now has aggregate traffic stats but I, for one, welcome our new Intarweb overlords.
  • So what's the difference between "hits" and "visits"? My website averages ~700 hits and ~80 visits per day. Which is more meaningful and/or significant?
    • by unity100 (970058)
      hits might be anything. visits generally contain actual 'people' with pcs.
    • Normally (although some like to blur the difference), a hit is an object load. It doesn't matter if that object load is by one IP or a billion IPs; each object load is one hit. A visitor is normally tracked by the only generally unique thing that can be tracked per visit, i.e. their IP. Ergo, if you have 10 visits and 100 hits, you can infer that each visitor made 10 object requests on average.

      NeoThermic
    • by `Sean (15328)
      Visits. A Web page with 10 images is 11 hits. First hit is the page itself and then the next 10 hits are the images loading. If you have a stats package that tracks page views then that's the number you're really interested in. Say you have 100 visitors and 1000 page views; that means your content is compelling enough for each visitor to view an average of 10 pages before they leave the site.
    • Re: (Score:2, Informative)

      by fimbulvetr (598306)
      I'm not sure if you're trying to point out how useless webstats can be, but generally speaking most stats software I've worked with count a "visit" as an ip that it hasn't seen for the last 30 (sometimes configurable) minutes. If it sees it more than 30 minutes later, it's a new visit. If it sees an IP hitting the site within a 30 minute window, the hit counter gets bumped but the visit counter doesn't.
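
      The 30-minute rule described above can be sketched in a few lines. This is an illustration of the general idea, not the algorithm of any particular stats package; the `count_visits` helper and the sample requests are hypothetical, and real tools may also factor in user agents or cookies.

```python
from datetime import datetime, timedelta


def count_visits(requests, timeout=timedelta(minutes=30)):
    """Count hits and visits from an iterable of (ip, datetime) pairs.

    Every request is a hit. A request starts a new visit when its IP has
    never been seen, or was last seen more than `timeout` ago.
    """
    last_seen = {}
    hits = visits = 0
    for ip, when in sorted(requests, key=lambda r: r[1]):
        hits += 1
        previous = last_seen.get(ip)
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[ip] = when  # the window slides with each new hit
    return hits, visits


# Hypothetical sample: IP .1 returns after a 40-minute gap, IP .2 shows up once.
SAMPLE = [
    ("10.0.0.1", datetime(2006, 11, 16, 9, 0)),
    ("10.0.0.2", datetime(2006, 11, 16, 9, 5)),
    ("10.0.0.1", datetime(2006, 11, 16, 9, 10)),
    ("10.0.0.1", datetime(2006, 11, 16, 9, 50)),
]

if __name__ == "__main__":
    hits, visits = count_visits(SAMPLE)
    print(hits, "hits,", visits, "visits")
```

      Because the key is the IP address alone, many users behind one proxy collapse into a single visitor, which is one reason these numbers are estimates at best.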
      • Re: (Score:2, Insightful)

        by nullchar (446050)
        Good point -- web stats can be amazingly useless. Don't read too much into them (oh gnos! I was crawled by wget!). And take the time to research what the numbers mean.

        Webstats can be useful for showing broken links (Why so many 404s for this file? Oh crap, Sally renamed it). They can also point out commonly missing files (robots.txt, favicon.ico, sitemap.xml or whatever). Web stats can also be used for optimization -- seeing 4000 hits with only 30 visits might mean you are using way too many images. (So
    • by daeg (828071)
      Hits, at least in most stat packages, are page views. Good stat packages won't inflate your hit numbers with things like CSS and image file hits.

      Visits are time-limited uniques. If I read 5 stories on Slashdot within an hour, that's 5 hits and 1 visit.

      Uniques are longer time-limited users. If I read 5 stories on Slashdot at 9 a.m., and 3 more at 4 p.m., that (at least in most packages) should be 8 hits, 2 visits, and 1 unique. Some packages will let you alter the timeline of Uniques, e.g., Weekly Uniques.

      Ge
    • It's actually much more complicated than most people think. The best write up I've seen is on Analog [analog.cx]'s site:

      This section [analog.cx] is about what happens when somebody connects to your web site, and what statistics you can and can't calculate. There is a lot of confusion about this. It's not helped by statistics programs which claim to calculate things which cannot really be calculated, only estimated. The simple fact is that certain data which we would like to know and which we expect to know are simply not availabl
      • I was about to post that same analog doc.

        That whole page is well worth reading.

        Many of the web stats packages other than analog really try to make you think they can get more data out than they really can.

        That page and the one above it (What the results mean [analog.cx]) should be required reading for anyone about to read a web stats report. I certainly send it to all my customers whenever I set them up with a report.

  • by Taimat (944976)
    Informant Advanced, and Cacti - do you really need anything else???
  • "More hits than germans surfing fetish websites
    Yo, that is a lot of hits."

    There's 45 hits in this song not counting the title.

    http://www.youtube.com/watch?v=B38c1e52vfY [youtube.com]
  • by Kvorg (21076) on Thursday November 16, 2006 @01:47PM (#16873014)

    Awstats seems to be the usual modern answer (http://awstats.sourceforge.net/ [sourceforge.net]), used and recommended by many admins and groups (in my case EGEE, the European Science Grid initiative, http://www.eu-egee.org/ [eu-egee.org]), but for traditionalists with no eye-candy desires, there is a copy of Webalizer (http://www.mrunix.net/webalizer/ [mrunix.net]) lurking on most servers and in almost all distribution package repositories. It's worth looking at the Wikipedia page for special and extended versions and general info on web server statistics and analysis: http://en.wikipedia.org/wiki/Webalizer [wikipedia.org].

    In particular, Stone Steps Webalizer is an interesting feature-rich, eye-candy-enabled version: http://www.stonesteps.ca/projects/webalizer/ [stonesteps.ca]. Others can be easily found on Freshmeat: http://freshmeat.net/search/?q=webalizer&section=projects [freshmeat.net] (e.g. Webalizer Extended with included Geolizer and extensive 404 analysis support, http://www.patrickfrei.ch/webalizer/ [patrickfrei.ch] and AWFFull with usability, CSS and geo-ip features, http://www.stedee.id.au/awffull [stedee.id.au] etc.).

    Others can be found on Freshmeat (117 hits at this time http://freshmeat.net/search/?q=web&trove_cat_id=245&section=trove_cat [freshmeat.net]) and Wikipedia (a very short and poor stub of a list that you might want to improve after your extensive testing :-) : http://en.wikipedia.org/wiki/Category:Free_web_analytics_software [wikipedia.org].

    There is also Sherlog, an Apache log analyser specialized in user experience tracking more than statistics, an interesting complementary tool (http://sherlog.europeanservers.net/ [europeanservers.net]).

  • Hosted solution (you put a bit of javascript in every page, trivial if you are using some sort of templating system).

    Don't have to deal with web logs, always updated in real time, AMAZING functionality. Just pricey. Our company found most of the open source or cheaper ones to be a bit lacking in functionality... just depends on your needs.
    • by Wilk4 (632760)
      then how does it record traffic by web spiders and those browsing with javascript disabled?
      • by matt_king (19018)
        If what you are looking for is how popular or useful your site is, web spiders are not really important. Since they are not real users, they really don't tell you much. If you need spider data, you can always slurp that in from logs but honestly, the business managers don't really care about that data in general. It actually handles users who don't use javascript by also including an in the header of the page so that if javascript is disabled, you can still track hits and paths.
        • by Wilk4 (632760)
          ah, thanks. Just wondered. I usually want the bots too for my webmaster's point of view, so Analog and regular logs are better for me. Good to know what else is out there though.
  • I looked at a couple of the popular ones, installed Awffull [freshmeat.net] and played with it for a bit. But it wasn't immediately obvious to me that any of the common ones supported aggregating stats across domains / hosts. E.g., I have 10 virtual servers on this Apache [apache.org] box, give me a sorted list of hits per domain/host. Probably one or more of the popular open-source stats packages [wikipedia.org] *does* do this, but I didn't feel like spending hours examining different ones and installing them. Since my needs were very basic I just wrot
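
    A sorted list of hits per virtual host, as asked for above, might be sketched like this. It is not the poster's script, which is not shown; it assumes every log line begins with the virtual host name (Apache can log this with the `%v` format directive), and the `hits_per_host` helper and sample lines are hypothetical.

```python
from collections import Counter


def hits_per_host(lines):
    """Return (host, hits) pairs sorted by hit count, highest first.

    Assumption: the first whitespace-separated field of each line is the
    virtual host name.
    """
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # blank line
        counts[fields[0].lower()] += 1  # host names are case-insensitive
    return counts.most_common()


# Hypothetical sample data in a vhost-prefixed Common Log Format.
SAMPLE = [
    'www.example.com 10.0.0.1 - - [16/Nov/2006:12:38:00 +0000] "GET / HTTP/1.1" 200 5120',
    'blog.example.org 10.0.0.2 - - [16/Nov/2006:12:38:05 +0000] "GET / HTTP/1.1" 200 812',
    'www.example.com 10.0.0.3 - - [16/Nov/2006:12:39:00 +0000] "GET /a.png HTTP/1.1" 200 300',
    'WWW.example.com 10.0.0.3 - - [16/Nov/2006:12:39:01 +0000] "GET /b.png HTTP/1.1" 200 300',
]

if __name__ == "__main__":
    for host, n in hits_per_host(SAMPLE):
        print(n, host)
```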
  • Next time you'll know to choose an open source solution.

    With a proprietary solution, the customers are the ones who support the product, and are then shafted when the product is discontinued.
  • http://www.tracewatch.com/ [tracewatch.com] for a free analytic program that actually really helps. I have tried most of the above for my site http://usa.tiouw.com/ [tiouw.com] and none came even close. Analysis is useless if you can't trace exactly how a user is going through your site. It might mean more work for you, but the pay-off is there. I know for sure that your clients will love you for it and that it will mean more business.
  • Sawmill [sawmill.net] is an excellent package. It's easy to configure, has nice drill-down features and great reporting. I'm not associated with the vendor, just a satisfied user.
  • WebTrends 8.0 is nice.. if you have a dedicated Windows server and deep pockets.
  • What problem are you trying to solve?

    This appears to me to be another project looking for a problem to solve. This is way too common in IT. Application cruft on the PC occupying memory, and generally bogging things down. Then the users complain that the network is too slow. Then you have to buy new hardware, and the new hardware needs new cruft... Lather, Rinse, Repeat.

    Of course, this could also be a "Review Fodder" project with the goal of adding a new line to your self-assessment paper to the boss.

  • There are two types of web analytics technologies: log analysis and page embedding. As you are providing statistics for 200 web sites (clients), you want to stick with a log analysis solution. As the two big boys (Microsoft and Google) are going to be providing FREE web analytics based on page embedding, many of the companies currently in the web analytics game will go out of business as half their market will be lost. That said, stick with a company that is big and has not put too many eggs in the hosted
  • by Doches (761288)
    I'm not sure if it quite fits your needs, but it's both fantastic and well-designed: http://www.haveamint.com/ [haveamint.com]
  • Switch to a LAMP stack and choose one of the many freely available analyzers for Apache on Linux. (Yeah, like you didn't see that coming...)
