Tutorials and FAQs: CGI: Setting up Webstats - Awstats
In this tutorial, I will explain how to install and configure one of the well-known webstats applications currently available: Awstats.
In a related tutorial I also explain how to install and configure another popular webstats application: Webalizer. Both packages can run on your cgi webspace, as I will explain later, so you can generate both sets of web statistics reports and choose which one you prefer. But first, let's get some of the basics out of the way.
Note: It is not possible to install or use Awstats if you have a Home Surf or Biz Surf account. These are very basic accounts designed primarily for web surfing; they do not come with any CGI webspace (which is needed to process the webstats data) and have no option to generate webstats logfiles.
Before you can do anything, you must activate your CGI webspace, if you have not already done so, and understand how to access your CGI shell using telnet or ssh. Info on doing this can be found in CGI/Shell Server Basics.
What are webstats and where are they stored
Webstats are raw details about which pages and areas of your website people have visited. Each time a visitor's browser requests an html webpage or a graphic image from your website, a record of that request is stored by the webserver in an access (webstats) log. A logfile analyser can then collate all those requests and present the statistics in graphical or condensed numerical form, giving you some feedback on the popularity of your website and which parts are attracting or not attracting visitors.
Because the cgi and www servers are completely separate systems, a separate webstats log file is generated for each server. The raw access log files are generated daily and are stored in a special directory called logs in your www webspace. The www webstats log files are named www.username.plus.com.gz and the cgi webstats log files are named cgi.username.plus.com.gz (with username being your name). The gz indicates that the files are stored in a compressed (gzipped) format. This saves space and speeds up file copying, as some very popular websites can produce tens of megabytes of raw data each day.
If you have any domains registered under your username, webstats can be generated for those as well, with the corresponding names www.domainname.xxx and cgi.domainname.xxx (xxx being .co.uk, .com, .ltd.uk etc.), depending on which server the domain is linked to.
Up to 8 days of webstats are available at any one time in the logs directory (today's and the previous 7 days). When a new day's worth of webstats is available, it is written as www.username.plus.com.gz, and each of the previous days' logfiles is renamed to www.username.plus.com.0.gz, www.username.plus.com.1.gz and so on up to .6.gz (0 = yesterday, 1 = day before yesterday and so on).
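The naming scheme described above can be sketched as a small shell function. This is only an illustration: the function name list_log_names is made up for this example and username is a placeholder for your own account name.

```shell
#!/bin/sh
# Print the eight log file names kept in the logs directory for one host.
# list_log_names is a hypothetical helper, not one of the supplied scripts.
list_log_names() {
    host="$1"                      # e.g. www.username.plus.com
    echo "$host.gz"                # today's log
    day=0
    while [ "$day" -le 6 ]; do     # .0 = yesterday ... .6 = seven days ago
        echo "$host.$day.gz"
        day=$((day + 1))
    done
}

list_log_names www.username.plus.com
```

Running it with cgi.username.plus.com instead lists the corresponding cgi log names.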
Because the webstats logfile processing can only be done on your CGI server, it is not possible to process the webstats files directly from the logs directory, so they must be copied to your cgi webspace first. To simplify this operation I have written a couple of scripts that will do this for you. As the webstats are only created once per day, the copying must also be done daily prior to being processed so I will also explain how the scripts can be run automatically each day using crontab.
By default, webstats logfiles are not activated for your webspace. To enable them you must click on the web stats link under My Account and click the activate button. The domains for which webstats are available are shown in the View Your Webstats table. To view the default Webalizer stats, just click on the individual domain names listed. Note: you will need to wait at least 24 hours after activating webstats before any webstats logfiles are available to view via the links in the table or to process yourself.
Awstats - A logfile analyser
Awstats is what is commonly referred to as a logfile analyser. It is able to read and process the raw webstats log files generated by the webserver and show the stats in a user-friendly, visual way using graphics and tables. Awstats can also process other types of logfiles (mail, ftp etc.) but this is outside the scope of this tutorial.
The following link will show you an example of what Awstats can produce and it represents the stats for the tutorialsteam www webspace:
http://cgi.tutorialsteam.plus.com/cgi-bin/awstats/awstats.pl?config=www
Awstats installation and configuration
Unlike Webalizer, Awstats is not installed on the CGI server so it must be downloaded and installed in your cgi webspace. In the following sections I will explain how to install and then configure Awstats to run on the CGI servers.
Note: I normally use a unix text file editor called vi (or vim) to edit files as this is the simplest editor available on *nix. If you are not familiar with vi and how it works, a simple guide to commands can be found at the end of CGI/Shell Server Advanced Topics. If you prefer to edit the files using Notepad or some other editor on Windows you can do so. You will need to ftp the files to your Windows PC, make the changes and ftp the files back to their original locations. There are some important differences between Windows and Unix text file formats, so please make sure you read CGI: Unix / Windows text file compatability to avoid problems when running the perl scripts after copying them back to the CGI server.
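The main format difference is the line ending: Windows editors usually end each line with a carriage return plus a line feed, while Unix expects a line feed only. As a rough sketch, assuming a file has picked up Windows line endings, the stray carriage returns could be stripped on the CGI server like this. The helper name strip_cr is invented for this example.

```shell
#!/bin/sh
# Strip carriage returns (Windows CRLF -> Unix LF) from a text file.
# strip_cr is a hypothetical helper name used only in this sketch.
strip_cr() {
    tmp="$1.tmp.$$"
    # Writing back with cat keeps the original file's permissions intact.
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}

# Demonstration on a throwaway file containing two CRLF-terminated lines.
demo=$(mktemp)
printf 'line one\r\nline two\r\n' > "$demo"
strip_cr "$demo"
tr -cd '\r' < "$demo" | wc -c    # counts the carriage returns that remain
rm -f "$demo"
```

A typical use would be strip_cr aw_getwwwstats.pl after uploading an edited script.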
Awstats downloads are available from http://www.awstats.org. The latest released version at the time of writing is V6.0 and consists of a compressed tar image called awstats-6.0.tgz. This or the current released version must be downloaded to your cgi webspace. The easiest way is probably to download the file from the website via your browser, save the file to your local disk then use FTP to put it in your cgi webspace home directory.
After you have copied it to your cgi home directory, you need to untar the files as follows:
| $ tar xvzf awstats-6.0.tgz |
This should then create a directory called awstats-6.0 (or awstats-6.X etc depending on the version you downloaded) containing the awstats files. For our purposes we now need to rename the directory to just awstats as follows:
| $ mv awstats-6.0 awstats |
We now have a starting directory called awstats from which to install and configure awstats.
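Since the directory name depends on the version you downloaded, the two steps above could also be combined so that the version number does not need to be typed. The following is a minimal sketch under the assumption that the first entry in the archive starts with the top-level directory name (as in awstats-6.0/...); unpack_awstats is a hypothetical helper name.

```shell
#!/bin/sh
# Extract an awstats tarball and rename its top-level directory to "awstats",
# whatever version number it carries. unpack_awstats is a hypothetical helper.
unpack_awstats() {
    tarball="$1"
    # Refuse to continue if an awstats file or directory already exists here.
    [ ! -e awstats ] || return 1
    # First path component of the first archive member, e.g. awstats-6.0
    topdir=$(tar tzf "$tarball" | head -n 1 | cut -d/ -f1)
    [ -n "$topdir" ] || return 1
    tar xzf "$tarball" && mv "$topdir" awstats
}

# Only runs if the tarball is present in the current directory.
if [ -f awstats-6.0.tgz ]; then
    unpack_awstats awstats-6.0.tgz
fi
```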
Note: At this point, do not attempt to follow the install instructions included with the program - i.e. do not run configure.pl. Due to the way the CGI servers are configured, it is not possible to use the automated configure.pl script to install and setup awstats as that assumes you are a system admin. It must be done manually as detailed below.
Next, we need to create directories where the www and cgi stats will be kept, copy some directories and files to their correct places and download some additional scripts and files:
| $ cd awstats/wwwroot |
| $ mv cgi-bin ../../cgi-bin/awstats |
| $ mv icon js classes css .. |
| $ cd .. |
| $ rmdir wwwroot |
| $ mkdir www cgi |
| $ wget http://www.tutorialsteam.plus.com/cgi/webstats/aw_files.tgz |
| $ tar xvzf aw_files.tgz |
| $ chmod 700 *.pl |
| $ chmod 604 *.conf |
| $ mv *.conf ../cgi-bin/awstats |
The chmod 700 (rwx------) stops other users from accessing the files. This is especially important for the perl (.pl) scripts as they will contain login and password information.
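To see what the two modes used above look like in a directory listing, here is a small illustration run against a temporary file. The helper name perm_string is invented for this example.

```shell
#!/bin/sh
# Show the permission string of a file (first 10 characters of ls -l output).
# perm_string is a hypothetical helper used only for this illustration.
perm_string() {
    ls -ld "$1" | cut -c1-10
}

demo=$(mktemp)
chmod 700 "$demo"; perm_string "$demo"   # prints -rwx------ (owner only)
chmod 604 "$demo"; perm_string "$demo"   # prints -rw----r-- (others may read)
rm -f "$demo"
```

The same check can be made on the real files with ls -l *.pl in $HOME/awstats.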
You should now have the following files in $HOME/awstats directory:
aw_getwwwstats.pl - perl script to copy www webstats logfile to cgi server and process it with awstats
aw_getcgistats.pl - perl script to copy cgi webstats logfile to cgi server and process it with awstats
awstats.cron - example crontab file for running above 2 perl scripts
both.cron - example crontab file containing entries for Webalizer and Awstats
And the following files added to $HOME/cgi-bin/awstats directory:
awstats.www.conf - awstats config file for processing www webstats
awstats.cgi.conf - awstats config file for processing cgi webstats
Next we need to configure the 2 perl scripts in the main awstats directory - don't worry if you have never used perl, the changes are very simple:
| $ vi aw_getwwwstats.pl |
This script is very similar to the one used by webalizer to get the webstats logfiles from your www webspace so the settings will be very similar:
username - enter your username you use to connect via FTP
password - enter your password for FTP (this will be the same as your portal login)
domain - enter the URL of your www domain (www.username.plus.com)
configFile - The name of the awstats config file (www or cgi)
outputDir - The full path to the awstats/www directory where the stats graphics will be written
awstats_cmd - full path to the cgi-bin/awstats/awstats.pl script
Repeat the same edit for aw_getcgistats.pl except use cgi in place of www.
Next we need to edit the awstats config files for www and cgi:
| $ cd $HOME/cgi-bin/awstats |
| $ vi awstats.www.conf |
Change the following entries:
SiteDomain - the domain name of your website (www.username.plus.com)
The other options that I have pre-configured for you (i.e. which are different to the default settings) are:
LogFile="../../awstats/awstats.log" (note this is not actually used)
LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %virtualname"
DirData="../../awstats/www"
DirCgi="/cgi-bin/awstats"
DirIcons="/awstats/icon"
Repeat the same edit for awstats.cgi.conf but specify cgi.username.plus.com for SiteDomain.
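If you would rather not open an editor for a one-line change, the SiteDomain edit could be scripted. This is a hedged sketch: set_site_domain is a hypothetical helper, and it assumes the entry sits on a single line of the form SiteDomain="..." at the start of the line.

```shell
#!/bin/sh
# Set the SiteDomain entry in an awstats config file.
# set_site_domain is a hypothetical helper; it assumes the entry is on one
# line starting with SiteDomain (spaces before the = sign are tolerated).
set_site_domain() {
    conf="$1"; domain="$2"
    tmp="$conf.tmp.$$"
    sed "s|^SiteDomain[ ]*=.*|SiteDomain=\"$domain\"|" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Demonstration on a throwaway file rather than a real config.
demo=$(mktemp)
printf 'SiteDomain="example"\nDirData="../../awstats/www"\n' > "$demo"
set_site_domain "$demo" www.username.plus.com
grep '^SiteDomain' "$demo"    # prints SiteDomain="www.username.plus.com"
rm -f "$demo"
```

On the CGI server the call would look like set_site_domain awstats.cgi.conf cgi.username.plus.com, run from $HOME/cgi-bin/awstats.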
And that is pretty much it as far as the install and configuration is concerned. The next step, as for Webalizer, is to process a webstats logfile, so one must be available in your www logs directory.
To test the transfer and processing of a webstats logfile, run the following commands, which should give output similar to that shown below:
| $ cd $HOME/awstats |
| $ ./aw_getwwwstats.pl |
| www.tutorialsteam.plus.com WebStats analysis started at Wed Apr 14 19:00:02 2004 |
| Last run Tue Apr 13 08:03:26 2004 (1081839806) |
| Getting www.tutorialsteam.plus.com.gz (4664 bytes) |
| Log file timestamp: Wed Apr 14 13:47:42 2004 (1081946862) |
| Running awstats update... |
| Update for config "/files/home2/tutorialsteam/cgi-bin/awstats/awstats.www.conf" |
| With data in log file "/files/home2/tutorialsteam/awstats/www/www.tutorialsteam.plus.com"... |
| Phase 1 : First bypass old records, searching new record... |
| Direct access to last remembered record has fallen on another record. |
| So searching new records from beginning of log file... |
| Phase 2 : Now process new records (Flush history on disk after 20000 hosts)... |
| Jumped lines in file: 0 |
| Parsed lines in file: 324 |
| Found 0 dropped records, |
| Found 0 corrupted records, |
| Found 0 old records, |
| Found 324 new qualified records. |
| Deleting log file... |
| www.tutorialsteam.plus.com 1 files processed in this run |
| www.tutorialsteam.plus.com WebStats analysis finished at Wed Apr 14 19:00:06 2004 |
| $ ./aw_getcgistats.pl |
| cgi.tutorialsteam.plus.com WebStats analysis started at Wed Apr 14 10:30:02 2004 |
| Last run Tue Apr 13 08:34:07 2004 (1081841647) |
| Getting cgi.tutorialsteam.plus.com.gz (5369 bytes) |
| Log file timestamp: Wed Apr 14 09:09:40 2004 (1081930180) |
| Running awstats update... |
| Update for config "/files/home2/tutorialsteam/cgi-bin/awstats/awstats.cgi.conf" |
| With data in log file "/files/home2/tutorialsteam/awstats/cgi/cgi.tutorialsteam.plus.com"... |
| Phase 1 : First bypass old records, searching new record... |
| Direct access to last remembered record has fallen on another record. |
| So searching new records from beginning of log file... |
| Phase 2 : Now process new records (Flush history on disk after 20000 hosts)... |
| Jumped lines in file: 0 |
| Parsed lines in file: 719 |
| Found 0 dropped records, |
| Found 0 corrupted records, |
| Found 425 old records, |
| Found 294 new qualified records. |
| Deleting log file... |
| cgi.tutorialsteam.plus.com 1 files processed in this run |
| cgi.tutorialsteam.plus.com WebStats analysis finished at Wed Apr 14 10:30:11 2004 |
If everything worked correctly you should now have some stats graphics available to view using the following URLs (replace username with your own name):
http://cgi.username.plus.com/cgi-bin/awstats/awstats.pl?config=www for www and
http://cgi.username.plus.com/cgi-bin/awstats/awstats.pl?config=cgi for cgi.
Finally we can set up the perl scripts to run automatically every day so the stats will be updated. This is done using crontab, Unix's equivalent of the Windows Task Scheduler.
Note: If you intend to run both Webalizer and Awstats you must use the both.cron file in place of awstats.cron shown below. This is because running crontab filename replaces (deletes) any existing crontab entries with what is in filename.
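If your crontab already holds other entries that are in neither supplied file, one possible workaround is to merge the existing listing with the new entries before loading it. This is only a sketch and not part of the supplied files: merge_cron is a hypothetical helper and the file names current.cron and merged.cron are made up for the example.

```shell
#!/bin/sh
# Combine an existing crontab listing with new entries, dropping exact
# duplicate lines while keeping the original order.
# merge_cron is a hypothetical helper; both arguments are plain files.
merge_cron() {
    cat "$1" "$2" | awk '!seen[$0]++'
}

# Possible usage on the CGI server (not run here):
#   crontab -l > current.cron
#   merge_cron current.cron awstats.cron > merged.cron
#   crontab merged.cron
```

Check merged.cron by eye before loading it, since loading replaces everything currently scheduled.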
I have already created a crontab file for you called awstats.cron. To add the commands to crontab just do the following:
| $ crontab awstats.cron (or crontab both.cron) |
| $ crontab -l |
The first command adds the contents of awstats.cron and the second lists what crontab entries have been added. It should look like the following:
| $ cat awstats.cron |
| 00 10,19 * * * cd $HOME/awstats; ./aw_getwwwstats.pl >> wwwcron.output |
| 30 10,19 * * * cd $HOME/awstats; ./aw_getcgistats.pl >> cgicron.output |
For information on what the fields such as 30 10,19 mean, see CGI: cron Task Scheduler. Basically, the 1st line runs its command at 10:00 and 19:00 and the 2nd line at 10:30 and 19:30. The commands are run twice a day because the webstats are sometimes not available before 10:00, so the 2nd run catches any logfiles that are generated late.
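The first two crontab fields are the minute and the hour (the remaining three are day of month, month and day of week). As a small illustration of how the comma-separated hour list expands into run times, consider the following sketch; cron_times is a hypothetical helper that handles only the simple case used in awstats.cron.

```shell
#!/bin/sh
# Expand the minute and hour fields of a crontab entry into clock times.
# cron_times is a hypothetical helper; it only handles a single minute value
# and a comma-separated hour list, which is all awstats.cron uses.
cron_times() {
    minute="$1"
    for hour in $(echo "$2" | tr ',' ' '); do
        echo "$hour:$minute"
    done
}

cron_times 00 10,19   # prints 10:00 then 19:00
cron_times 30 10,19   # prints 10:30 then 19:30
```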
To check that everything is working, look at the wwwcron.output and cgicron.output files to confirm the processing has occurred. Because the crontab entries append to these files on every run, delete them from time to time to stop them using up a lot of disk space.
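Rather than deleting the .output files by hand, they could be emptied automatically once they pass a certain size. The following is a minimal sketch; trim_output is a hypothetical helper and the 1 MB limit in the usage line is an arbitrary assumption.

```shell
#!/bin/sh
# Empty a cron output file once it grows beyond a size limit (in bytes).
# trim_output and the limit passed to it are assumptions made for this sketch.
trim_output() {
    file="$1"; limit="$2"
    [ -f "$file" ] || return 0
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$limit" ]; then
        : > "$file"        # truncate in place, keeping the file itself
    fi
}

# Possible usage: trim_output "$HOME/awstats/wwwcron.output" 1048576
```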
Adding additional domains to the webstats
This tutorial described how to create webstats info for your default www and cgi websites. If you have additional domains associated with your account, you can process webstats data for them just as easily.
- Create a separate directory in $HOME/awstats (e.g. mydomain)
- Make a copy of aw_getwwwstats.pl and call it something like aw_getmydomainstats.pl
- Make a copy of awstats.www.conf file in cgi-bin/awstats directory and call it awstats.mydomain.conf
- Modify the necessary parameters in the new .pl and .conf files to refer to the new directory and new domain name
- Add an additional entry into crontab to run the new .pl script at a slightly different time to the rest.
- The URL to the awstats webstats will be http://cgi.username.plus.com/cgi-bin/awstats/awstats.pl?config=mydomain
As before, run the new .pl command manually to make sure it works before adding it to crontab.
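The copying steps in the list above can be sketched as a shell function. It only creates the directory and the copies; the parameters inside the new .pl and .conf files still have to be edited by hand, and the crontab entry added separately. add_domain_files is a hypothetical helper and mydomain is a placeholder name.

```shell
#!/bin/sh
# Create the files needed for an extra domain by copying the www ones.
# add_domain_files is a hypothetical helper and "mydomain" a placeholder;
# the copies still need editing by hand afterwards.
add_domain_files() {
    name="$1"
    mkdir -p "$HOME/awstats/$name" || return 1
    # cp -p keeps the restrictive permissions set on the originals.
    cp -p "$HOME/awstats/aw_getwwwstats.pl" \
          "$HOME/awstats/aw_get${name}stats.pl" || return 1
    cp -p "$HOME/cgi-bin/awstats/awstats.www.conf" \
          "$HOME/cgi-bin/awstats/awstats.$name.conf"
}

# Possible usage: add_domain_files mydomain
```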
Further reading / information for Awstats
It is possible to configure Awstats to show different information via additional .conf file options. What I have given you is the basic config and is enough to get going. You may want to tweak the information to suit your purposes but this is beyond the scope of this tutorial. For more information on Awstats and what config options are available please visit http://www.awstats.org
That completes the setup of Awstats and also concludes this tutorial. I hope it has proved useful to you.
Original Article by: petervaughan - Edited by: csogilvie