BrowserCounter 1.3.0

BrowserCounter 1.3.0 is a small (well, ok, not that small) Perl program (NOT a CGI) that scans any standard 'combined' (ECLF) format web log and produces a table summarizing what browsers people have used to access a web server.

You will need the GD::Graph module if you want the pie charts (available from CPAN or PPM repositories). The program will still run without it - you just won't get any pie charts.

An example of a report is available. You can get the script (right-click to save it) here as well. A sample configuration file is also available. The configuration file is only necessary if you want to change the report layout or options.

If you like and use this program - let me know - I would love to hear about it.

Usage Synopsis

BrowserCounter - A program for generating web browser usage statistics (v. 1.3.0)

A fast, portable, and highly configurable program for analyzing web browser usage in standard and non-standard format web log files.


Required:
Perl 5.6 or later
Optional:
GD and GD::Graph (if pie charts are wanted)
Time::HiRes (improves precision of performance metrics)


This program is licensed under the same conditions and terms as Perl itself.

This means that you can, at your option, redistribute it and/or modify it under the terms of either the GNU General Public License (GPL) version 1 or later, or the Perl Artistic License.

See http://dev.perl.org/licenses/


By default it generates a report on the platforms (Mac, Windows, Linux, etc.) used by web browsers, a summary of the 'brand' (MSIE, Firefox, Mozilla, Safari, etc.), and summaries of the brands broken down by major and minor version. The default configuration is at the end of the script and is documented within it.
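As a rough illustration of the platform / brand / version breakdown described above, here is a minimal Python sketch (not the script's own Perl code). The matching rules in PLATFORMS and BRANDS, and the function names classify and summarize, are assumptions made up for this example; the real rules live in the script's configuration.

```python
import re
from collections import Counter

# Hypothetical matching rules, for illustration only.
PLATFORMS = [("Windows", "Windows"), ("Macintosh", "Mac"), ("Linux", "Linux")]

# Order matters: most agents start with "Mozilla/", so it is tried last.
BRANDS = [
    ("MSIE", re.compile(r"MSIE (\d+)\.(\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)\.(\d+)")),
    ("Mozilla", re.compile(r"Mozilla/(\d+)\.(\d+)")),
]

def classify(agent):
    """Return (platform, brand, major, minor) for one user-agent string."""
    platform = next((name for token, name in PLATFORMS if token in agent), "Other")
    for brand, pattern in BRANDS:
        match = pattern.search(agent)
        if match:
            return platform, brand, match.group(1), match.group(2)
    return platform, "Other", None, None

def summarize(agents):
    """Tally platforms, brands, and brands by major and minor version."""
    platforms, brands, majors, minors = Counter(), Counter(), Counter(), Counter()
    for agent in agents:
        platform, brand, major, minor = classify(agent)
        platforms[platform] += 1
        brands[brand] += 1
        if major is not None:
            majors[f"{brand} {major}"] += 1
            minors[f"{brand} {major}.{minor}"] += 1
    return platforms, brands, majors, minors
```

Each of the four counters corresponds to one table in the kind of report described above; a real classifier needs many more rules than this.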

It performs extensive filtering of processed log entries to generate results that are as nearly 'pure' as possible. In practice, this means that 90% of entries are excluded because of statistical biases caused by differences in browser behavior, such as loading/not loading images, loading/not loading stylesheets, loading/not loading JavaScript and ActiveX, as well as more subtle biases such as image loading sequence.

By default, it also excludes identifiable web crawling robots from the statistics.

These behaviors can be changed, if desired, by using a custom configuration file.
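To make the filtering idea concrete, here is a minimal Python sketch (again, not the script's own code) that parses one 'combined' format line and decides whether it should be counted. The patterns ASSET_RE and ROBOT_RE and the function name countable_agent are hypothetical; the script's actual exclusion rules are defined in its configuration and are more extensive.

```python
import re

# Combined format: host ident user [time] "request" status bytes "referer" "agent"
# Simplification: quoted fields containing escaped quotes are not handled.
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)

# Hypothetical exclusion rules, for illustration only.
ASSET_RE = re.compile(r'\.(?:gif|jpe?g|png|ico|css|js)(?:\?.*)?$', re.IGNORECASE)
ROBOT_RE = re.compile(r'bot|crawl|spider|slurp', re.IGNORECASE)

def countable_agent(line):
    """Return the user-agent string if the entry should be counted, else None."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None  # not a combined-format line
    parts = match.group("request").split()
    if len(parts) < 2:
        return None  # malformed request field
    path = parts[1]
    agent = match.group("agent")
    if ASSET_RE.search(path):
        return None  # images, stylesheets, and scripts bias the counts
    if agent in ("", "-") or ROBOT_RE.search(agent):
        return None  # missing agent or identifiable robot
    return agent
```

Counting only page requests means a browser that loads twenty images per page is not weighted twenty times more heavily than one that loads none.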

The program will generate pie chart graphs of the analyzed statistics if the 'GD' and 'GD::Graph' modules are installed (available from CPAN http://cpan.org/ and from various PPM repositories). If not available, the pie charts will be omitted from generated reports.

You can generate an initial configuration file using the '--new_config_file' option:

browsercounter.pl --new_config_file=example_configuration.conf

The configuration file contains documentation for its options along with a template that can be used to reformat the output report pretty much in any way you want.

The command line options for the program are as follows:

--usage or --help

Prints this usage message and exits.

--new_config_file=example.conf

Saves a new initial configuration file to the specified file.

--report_title="Browser Report"

This allows setting the report title. The default is 'Browser Report'.

--output_dir=/var/www/html/statistics/browser

The path to the directory where the output report will be saved. The default is /var/www/html/statistics/browser

If the output directory does not exist the program will die with an error.

The --output_dir command line option overrides any configuration file 'output_dir' specification.

--output_file=index.html

Specifies the name of the output report file. The default is index.html

The --output_file command line option overrides any configuration file 'output_file' specification.

--config_file=/path/to/configuration_file

The path to a configuration file for controlling the report. If none is given, the internally specified default configuration is used.

Example: browsercounter.pl --config_file=/var/www/configs/browsercounter.conf

Any files listed on the command line after all options have been processed are assumed to be log files for processing and override the logfiles specified within the configuration.
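The precedence rules described above (command line over configuration file over built-in defaults) can be sketched in Python as follows. This is an illustration under assumptions, not the script's code: the function name resolve_settings is made up, file_config is a plain dict standing in for an already-parsed configuration file, the empty default logfile list is a placeholder, and applying the same override rule to --report_title is assumed rather than documented.

```python
import argparse

# Defaults taken from the option descriptions; "logfiles" is a placeholder.
DEFAULTS = {
    "report_title": "Browser Report",
    "output_dir": "/var/www/html/statistics/browser",
    "output_file": "index.html",
    "logfiles": [],
}

def resolve_settings(argv, file_config):
    """Merge defaults, configuration file values, and command line options."""
    parser = argparse.ArgumentParser(add_help=False)
    for name in ("report_title", "output_dir", "output_file", "config_file"):
        parser.add_argument("--" + name)
    # Everything left after the options is treated as a logfile.
    parser.add_argument("logfiles", nargs="*")
    args = parser.parse_args(argv)

    settings = dict(DEFAULTS)
    settings.update(file_config)  # configuration file beats built-in defaults
    for name in ("report_title", "output_dir", "output_file"):
        value = getattr(args, name)
        if value is not None:
            settings[name] = value  # command line beats configuration file
    if args.logfiles:
        settings["logfiles"] = args.logfiles  # replaces, does not extend
    return settings
```

Note that logfiles named on the command line replace the configured list outright instead of being appended to it.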

The program can handle gzip (.gz), compress (.z), and bzip2 (.bz2) compressed logfiles as long as 'gzip' and 'bzip2' are in the PATH. You can add additional compression program support via the configuration file.
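One way to picture the compressed logfile handling is an extension-to-command table, with each compressed file streamed through the external program. The Python sketch below follows that idea; the table DECOMPRESSORS and the function names are hypothetical, and the script's real table is defined in its configuration.

```python
import subprocess

# Hypothetical extension-to-command table, for illustration only.
DECOMPRESSORS = {
    ".gz": ["gzip", "-dc"],
    ".z": ["gzip", "-dc"],    # gzip can also read compress(1) output
    ".bz2": ["bzip2", "-dc"],
}

def decompress_command(path):
    """Return the command that streams `path` to stdout, or None for plain files."""
    lowered = path.lower()
    for extension, command in DECOMPRESSORS.items():
        if lowered.endswith(extension):
            return command + [path]
    return None

def read_log_lines(path):
    """Yield text lines from a plain or compressed logfile."""
    command = decompress_command(path)
    if command is None:
        with open(path, "r", errors="replace") as handle:
            yield from handle
        return
    # The external program must be in the PATH, as noted above.
    with subprocess.Popen(command, stdout=subprocess.PIPE) as process:
        for raw in process.stdout:
            yield raw.decode("latin-1")
```

Supporting another compression format then only requires one more entry in the table.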

Multiple logfiles are supported both on the command line and via the configuration file.

Simple usage example:

browsercounter.pl --report_title="My Report" --output_dir=/var/www/html/statistics --output_file=browsers.html /var/log/httpd/access_log

Note: BROWSERCOUNTER IS NOT A CGI SCRIPT. It is intended to be run periodically (say once a day, week or even month) via a system job such as 'cron' and generate static HTML pages for viewing.

You could probably modify it to work as a CGI with a modest amount of effort, but it is not a good idea: log analysis is a resource-intensive exercise, and for logs of more than a few megabytes it takes a noticeable amount of time (a half-million line log takes roughly 30 seconds to process on a 3 GHz Pentium system).

<URL:http://nihongo.org/snowhare/utilities/browsercounter.html>