Patterns: Matching Documents and Sites

Many of the configuration file options, such as allow and totals, accept patterns. A pattern lets a single line in the configuration file match several, many, or all documents or sites.

The simplest pattern is a document name (or site name). For instance:

/directory/name.html

This pattern matches only the single document /directory/name.html. If it is listed for the ignore option, then that document will be ignored in all statistics produced by wusage.

A slightly more complex pattern uses the * character to match any number of characters. This is similar to the way both MSDOS and Unix use the * character for commands such as Unix ls and MSDOS dir. For instance:

*.gif

This pattern matches all document names that end in .gif. This is useful in the ignore option, where it instructs wusage to completely ignore all accesses to GIF-format images.

Note: the suffixes option is applied first, before options such as allow and ignore. If you want to write a pattern that matches a document such as /index.html, take into account that the index.html part will be removed by the standard suffixes option and use the / by itself. For the index file of a subdirectory, the slash will also be removed to combine all accesses to that index into one document name.
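As a rough illustration of the note above, the following Python sketch shows how a document name might look after suffix removal, so you can see which name a pattern has to match. The function name normalize and the default suffix list are hypothetical and are not part of wusage. The handling of a bare trailing slash is also an assumption, based on the statement that accesses to a subdirectory index are combined into one document name.

```python
def normalize(name, suffixes=("index.html",)):
    """Approximate the document name seen by patterns after suffix removal.

    Hypothetical helper for illustration only; wusage's real behavior
    depends on how its suffixes option is configured.
    """
    for suffix in suffixes:
        # Strip the index file name, leaving the directory with its slash.
        if name.endswith("/" + suffix):
            name = name[: -len(suffix)]
            break
    # For a subdirectory, drop the trailing slash as well; the top-level
    # document stays "/" (assumed from the note above).
    if len(name) > 1 and name.endswith("/"):
        name = name[:-1]
    return name


if __name__ == "__main__":
    for doc in ("/index.html", "/docs/index.html", "/docs/page.html"):
        print(doc, "->", normalize(doc))
```

Under these assumptions, a pattern meant for the top-level index would be written as / and a pattern for the index of /docs would be written as /docs.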

The * character can appear more than once, and it can appear at any point in the pattern. (This is slightly different from the way MSDOS uses the *.)

The ? character can also be used. ? matches any one character in the document name or site name.

Finally, the | character can be used to separate distinct patterns on the same line. If a document name or site name matches any of the patterns separated by the | character on that line, it is considered to be a match for the complete pattern.
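The rules described above (*, ? and |) can be sketched in a few lines of Python. This is only an approximation written for illustration, not wusage's own code. The function names are hypothetical, and the sketch assumes that * may match zero characters, that * and ? may match a / character, that matching is case-sensitive, and that the whole name must match the pattern.

```python
import re


def pattern_to_regex(pattern):
    """Translate one pattern (without '|') into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any number of characters (assumed: zero or more)
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character matches itself
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern_line, name):
    """Return True if name matches any of the '|'-separated patterns."""
    return any(
        pattern_to_regex(alternative).fullmatch(name) is not None
        for alternative in pattern_line.split("|")
    )


if __name__ == "__main__":
    print(matches("/directory/name.html", "/directory/name.html"))
    print(matches("*.gif", "/images/logo.gif"))
    print(matches("/page?.html", "/page12.html"))
    print(matches("*.gif|*.jpg", "/photo.jpg"))
```

The sketch converts each pattern to a regular expression rather than using Python's fnmatch module, because fnmatch also gives special meaning to square brackets, which this page does not describe.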



Copyright 1996, Boutell.Com, Inc.
wusage@boutell.com


Boutell.Com, Inc - PO Box 20837, Seattle WA, 98102, USA
Phone/Fax +1 206.325.3009