Analyzing Many Log Files

Popular sites have a problem: log files become very large very quickly! They must be set aside in some way once they are too large to keep uncompressed, or too large to keep on the hard drive at all.

Most sites solve this problem by periodically compressing and setting aside old log files. This presents no problem for wusage, as long as wusage has already analyzed the data in the log file that has been set aside. Programs such as Unix compress, and DOS-oriented .zip utilities such as PKWare's PKZIP, are useful tools for compressing and storing old log files. We suggest that administrators compress and set aside old log files immediately after running wusage: just after midnight each day, or on Sunday for weekly analysis. Wusage will politely ignore data that has already been analyzed if it is present in a log file, but setting such data aside in another file does speed up the program.

"My server generates a new log file for each day, so I have dozens of log files already and more on the way. How can I analyze these logs with wusage?"

If you have many uncompressed log files, just use the logdir option to tell wusage where to find them all, and keep them all in a single directory until they have been analyzed. This feature is new as of version 4.1. You can keep them there indefinitely if you wish, but wusage runs faster if only the data not yet analyzed is present in the log directory. If your log files are compressed and you do not wish to uncompress them, skip ahead to read about coping with compressed log files.

"I have many mirror sites, so I have a collection of log files that all contain entries from the same period in time. Can wusage cope with this?"

As of version 4.1, the answer is yes. Just use the logdir option to tell wusage where to find the log files you have collected. If you have more than 20 log files that cover the same period in time, for instance 50 log files from 50 mirrors of the same site, wusage may slow down considerably. This does not occur with non-overlapping log files, even if there are hundreds of them. If your needs cannot be met with a limit of 20 overlapping log files, please contact us at wusage@boutell.com with information about your specific needs.

"This is all well and good, but I already have several old compressed log files. How do I analyze them with wusage?"

The Unix cat and zcat commands are extremely useful for reconstructing a single log file from several pieces. Even better, thanks to the -l option of wusage, you can avoid creating a combined log file on disk at all. Consider the following Unix command line:

cat oldlog1 oldlog2 oldlog3 | wusage -l -

The special filename - (a single dash) signals wusage to read its log entries from standard input, which is piped in from the Unix cat program. For more information about cat, try the Unix command man cat.

Important note: feed log files to cat in ascending order of age. An older log file must precede a newer one.
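When there are too many old log files to list by hand, the ordering rule above can be sketched as a small shell function. This is only an illustration, not part of wusage: the name cat_logs_oldest_first is hypothetical, and it assumes that each file's modification time reflects the age of its entries and that file names contain no newlines.

```shell
# Hypothetical helper (not part of wusage): print the given log files
# to standard output, oldest first.
# Assumption: file modification times reflect the age of the log data.
# Assumption: file names contain no newline characters.
cat_logs_oldest_first() {
    # ls -t sorts by modification time, newest first; -r reverses
    # that, so the oldest file is listed (and printed) first.
    ls -tr -- "$@" | while IFS= read -r logfile; do
        cat -- "$logfile"
    done
}

# Example use, assuming wusage is installed and on your PATH:
#   cat_logs_oldest_first oldlog* | wusage -l -
```

If your log files carry the date in their names instead, listing them in name order may be a more reliable choice than relying on modification times, which can change when files are copied.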

"What if I have compressed data?"

The following Unix pipeline will send several compressed log files to wusage:

zcat oldlog1.Z oldlog2.Z | wusage -l -

And the following pipeline will send a combination of compressed and uncompressed log files to wusage:

zcat oldlog1.Z oldlog2.Z | cat - oldlog3 oldlog4 | wusage -l -

Note that both cat and wusage are being fed information from the output of the preceding program, using the special filename -.
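If compressed and uncompressed files are interleaved in time, a single function that picks the right tool per file keeps them in the order you list them. The sketch below is an assumption-laden illustration rather than anything wusage provides: the name emit_logs is hypothetical, it chooses the tool by file extension only, and it uses gzip -dc in place of zcat on the assumption that gzip is installed (gzip can read files made by compress as well as its own .gz files).

```shell
# Hypothetical helper (not part of wusage): write each log file to
# standard output in the order given, decompressing where needed.
# Assumption: gzip is installed; "gzip -dc" reads both .Z and .gz files.
# Assumption: the extension correctly indicates whether a file is compressed.
emit_logs() {
    for logfile in "$@"; do
        case "$logfile" in
            *.Z|*.gz) gzip -dc -- "$logfile" ;;  # compressed: decompress to stdout
            *)        cat -- "$logfile" ;;       # uncompressed: copy as-is
        esac
    done
}

# Example use, assuming wusage is installed and on your PATH:
#   emit_logs oldlog1.Z oldlog2.Z oldlog3 oldlog4 | wusage -l -
```

As with cat, list the files from oldest to newest.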

"What about non-Unix platforms?"

The MSDOS-related operating systems support the type command, which can be used to output several files and pipe that output to wusage:

type oldlog1.dat oldlog2.dat oldlog3.dat | wusage -l -

"I ran wusage on an old log file, and now I have zero accesses for the last two months! What happened?"

Wusage normally creates reports through the most recent complete day or week, and will not generate those reports again. An old log file contains no entries for recent dates, so the reports for that period come out empty. You can override this behavior using the -b and -e command line options, which force wusage to start re-generating reports at an earlier date, or to stop well before the present date. If you inadvertently produce empty reports for the most recent several weeks or months, just use the -b option to specify a date from which wusage should start re-generating those reports, and specify the more recent log file as well using the -l option.



Copyright 1996, Boutell.Com, Inc.
wusage@boutell.com


Boutell.Com, Inc - PO Box 20837, Seattle WA, 98102, USA
Phone/Fax +1 206.325.3009