scheduler - zmailer transport queue scheduler daemon
scheduler [ -divFHnQSVW ] [ -f configfile ] [ -E newentsmax ] [ -L logfile ] [ -l statisticslog ] [ -N transpmaxfno ] [ -P postoffice ] [ -p channel/hostpair ] [ -R maxforkfreq ] [ -q rendezvous ]
The scheduler daemon manages the delivery processing of messages in ZMailer.
The router(8) creates message control files in the POSTOFFICE/transport directory. These refer to the original message files in the POSTOFFICE/queue directory.
The scheduler reads each message control file from POSTOFFICE/transport/, translates the contained message and destination information into internal data structures, and unlinks the message control file.
Based on scheduling, priority, and execution information read from a configuration file, the scheduler arranges to execute Transport Agents relevant to the queued messages.
At the time scheduled for a particular transport agent invocation, the scheduler will start a transport agent (or use one from the idle pool), and tell it, one by one, which message control files to process. When all the destination addresses in a message have been processed, the scheduler performs any error reporting tasks, and then deletes the message control file in POSTOFFICE/transport and the original message file in POSTOFFICE/queue.
All message delivery is actually performed by Transport Agents, which are declared in a configuration file for the scheduler. Each transport agent is executed with the same current directory as the scheduler. The scheduler-transporter interaction protocol is described later in this manpage.
The standard output of each transport agent consists of destination address delivery reports: either successful delivery, unsuccessful delivery, or deferral of the address. Each report uses byte offsets in the message control file to refer to the address. Reports may also include a comment line which will be displayed in the scheduler's own reports.
Two types of reports are produced:
1. Error messages caused by unsuccessful delivery of a message are appended to its message control file. Occasionally, for example, when all addresses have been processed, the scheduler generates an error message to the error return address of the message (usually the original sender).
2. The scheduler binds itself to a well-known TCP/IP port (MAILQ, TCP port 174) on startup. Any connections to this port are processed synchronously in the scheduler at points in the execution where the state is internally consistent. The scheduler simply dumps its internal state in a terse format to the TCP stream. It is expected that the client program will reconstruct the data structures sufficiently to give a user a good idea of what the scheduler thinks the world looks like. The mailq(1) program serves this purpose.
Invoking scheduler without any argument will start it as a daemon.
-d
run as a daemon, usually used after -v to log daemon activity in great detail.

-E newentsmax
when globbing new tasks from the directory, pick only the first ``newentsmax'' of them, and leave the rest for a new scan run.

-f configfile
overrides the default configuration file MAILSHARE/scheduler.cf.

-F
Freeze: don't actually run anything, just do queue scanning. (For debugging purposes.)

-H -HH
Use multilevel hashing in the spool directories. This reduces the length of the directory scans needed to find an arbitrary file in them. One `H' means "single level hashing", two `H's mean "dual level hashing". A ``hash'' is a directory whose name is a single upper-case letter (A-Z).
When it exists, the ZENV variable SCHEDULERDIRHASH overrides the `H' option.

-i
run interactively, i.e., not as a daemon.

-L logfile
overrides the default log file location LOGDIR/scheduler.

-l statisticslog
starts the appending of delivery statistics information (in ASCII form) to the given file. No default value.

-M [1|2]
Version of the mailq protocol this server runs; essentially a test option, as the existence of a PARAMauthfile=".." assignment in the scheduler.conf file turns the protocol into version 2.

-N transpmaxfno
sets how many filehandles are allocated for the scheduler's started children (if the system has adjustable resources).

-n
Toggles the configuration flag `default_full_content', which defines the assumed value of the DSN RET parameter in case the originator didn't supply that parameter. The default behaviour is similar to RET=FULL, while use of this option is equivalent to RET=HDRS. This option does not override an originator-supplied DSN RET parameter value.

-p channel/host
A debug-type option for selectively running some thread under a single instance of the scheduler. Use this together with the -v option.

-P postoffice
specifies an alternate POSTOFFICE directory.

-q rendezvous
on machines without TCP/IP networking, the rendezvous between scheduler and mailq(1) is done using a well-known named pipe. This option overrides the default location for this special file, either RENDEZVOUS or /usr/tmp/.mailq.text.

-Q
The ``Q'' mode: don't output the old-style data to the queue querier, only the new-style one.

-S
Synchronous startup mode: scans all jobs in the directory before starting even the first transporter.

-v
be verbose about activity, and do not detach as a daemon.

-W
to be used in conjunction with -v to delay the start of verbose logging until after all the files have been parsed in, and it is time to do scheduling.

-V
print version message and run interactively.
The scheduler configuration file consists of a set of clauses. Each clause is selected by the pattern it starts with. The patterns for the clauses are matched, in sequence, with the channel/host string for each recipient address. When a clause pattern matches an address, the parameters set in the clause will be applied to the scheduler's processing of that address. If the clause specifies a command, the clause pattern matching sequence is terminated. This is a clause:
local/*
        interval=10s
        expiry=3h
        # want 20 channel slots in case of blockage on one
        maxchannel=20
        # want 20 threadring slots
        maxring=20
        command="mailbox 8"
A clause consists of:
- A selection pattern (in shell style) that is matched against the channel/host string for an address.
- 0 or more variable assignments or keywords (described below).
There are also several possible PARAM assignments, which start at column 0; they are described further below.
If the selection pattern does not contain a '/', it is assumed to be a channel pattern and the host pattern is assumed to be the wildcard '*'.
The components of a clause are separated by whitespace. The pattern introducing a clause must start in the first column of a line, and the variable assignments or keywords inside a clause must not start in the first column of a line. This means a clause may be written either compactly, all on one line, or spread out with an assignment or keyword per line.
If the clause is empty (i.e., consists only of a pattern), then the contents of the next non-empty clause will be used.
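The clause selection rules above can be sketched in a few lines of Python. This is only an illustration, not the scheduler's code: the names split_pattern, matches and resolve are hypothetical, and it assumes that fnmatch-style matching, applied separately to the channel and host parts, is an adequate stand-in for the scheduler's shell-style matching.

```python
from fnmatch import fnmatchcase


def split_pattern(pattern):
    """Split 'channel/host'; a pattern without '/' implies host '*'."""
    channel, sep, host = pattern.partition("/")
    return channel, (host if sep else "*")


def matches(pattern, channel, host):
    """Match the channel and host parts separately (assumption)."""
    chan_pat, host_pat = split_pattern(pattern)
    return fnmatchcase(channel, chan_pat) and fnmatchcase(host, host_pat)


def resolve(clauses, channel, host):
    """clauses is an ordered list of (pattern, settings-dict) pairs.

    Every matching clause contributes its settings, in sequence; the
    first matching clause that sets a command ends the search.
    """
    params = {}
    for i, (pattern, settings) in enumerate(clauses):
        if not matches(pattern, channel, host):
            continue
        # An empty clause uses the contents of the next non-empty clause.
        j = i
        while not settings and j + 1 < len(clauses):
            j += 1
            settings = clauses[j][1]
        params.update(settings)
        if "command" in settings:
            break
    return params
```

With a */* clause providing defaults and a local/* clause providing a command, resolve() returns the defaults overlaid with the local settings and stops there.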
The typical configuration file will contain the following clauses:
- a clause matching all addresses (using the pattern */*) that sets up default values.
- a clause matching the local delivery channel (usually local).
- a clause matching the deferred delivery channel (usually hold).
- a clause matching the error reporting channel (usually error).
- clauses specific to the other channels known by the router, for example, smtp and uucp.
The actual names of these channels are completely controlled by the router configuration file.
Empty lines, and lines whose first nonwhitespace character is '#', are ignored.
Variable values may be unquoted words or values, or double-quoted strings. Intervals (delta times) are specified as a concatenation of numbers, each suffixed with an 's', 'm', 'h', or 'd' modifier designating the number as a second, minute, hour, or day value. For example: 1h5m20s.
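As an illustration of the delta time syntax, the following Python sketch converts such a specification to seconds. The function name parse_interval is hypothetical, and rejecting a bare number without a suffix is an assumption; the scheduler's own parser may be more lenient.

```python
import re

# Seconds per unit suffix, as listed in the text above.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text):
    """Convert a delta time such as '1h5m20s' into seconds."""
    parts = re.findall(r"(\d+)([smhd])", text)
    # Reject anything that is not purely a run of <number><suffix> pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError("bad interval: %r" % (text,))
    return sum(int(n) * _UNITS[u] for n, u in parts)
```

For instance, parse_interval("1h5m20s") adds 3600, 300 and 20 seconds.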
The known variables and keywords, and their typical values and semantics are:
interval (1m)
specifies the primary retry interval, which determines how frequently a transport agent should be scheduled for an address. The value is a delta time specification. This value, and the retries value mentioned below, are combined to determine the interval between each retry attempt.

idlemax (3x interval)
When a transport agent runs out of jobs, it is moved to the ``idle pool'', and if a TA spends more than idlemax time there, it is terminated.

expiry (3d)
specifies the maximum age of an address in the scheduler queue before a repeatedly deferred address is bounced with an expiration error. The actual report is produced when all addresses have been processed.
retries (1 1 2 3 5 8 13 21 34)
specifies the retry interval policy of the scheduler for an address. The value must be a sequence of positive integers, these being the multiples of the primary interval to wait before a retry is scheduled. The scheduler starts by going through the sequence as an address is repeatedly deferred. When the end of the sequence is reached, the scheduler will jump into the sequence at a random spot and continue towards the end. This allows various retry strategies to be specified easily: brute force (or "jackhammer"), etc.
skew (5)
Leftover from earlier scheduler internal structure. Does not make

user (root)
is the user id of a transport agent processing the address. The value is either numeric (a uid) or an account name.

group (daemon)
is the group id of a transport agent processing the address. The value is either numeric (a gid) or a group name.
command (smtp srl ${LOGDIR}/smtp $host)
is the command line used to start a transport agent to process the address. The program pathname is specified relative to the MAILBIN/ta directory.
The string "$channel" is replaced by the current matched channel, and
It is strongly recommended that the $host is not to be used on a
It is possible to place environment-string setting statements into the
command="MALLOC_DEBUG_=1 OTHER=var cmdname cmdparams"

queueonly
a clause with queueonly flag does not autostart at the arrival of a
To have message expiration working, following additional entries are
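The interplay of interval and retries described above can be sketched as follows. This is a simplified model, not the scheduler's implementation; the function name retry_delays and its parameters are hypothetical, and the retries sequence is assumed to be non-empty.

```python
import random


def retry_delays(interval_s, retries, attempts, rng=random):
    """Return the delay, in seconds, before each of `attempts` retries.

    Each delay is the primary interval multiplied by the next entry of
    the retries sequence. Past the end of the sequence, the position
    jumps to a random spot and continues towards the end again.
    """
    delays = []
    pos = 0
    for _ in range(attempts):
        if pos >= len(retries):
            pos = rng.randrange(len(retries))
        delays.append(retries[pos] * interval_s)
        pos += 1
    return delays
```

With interval=1m and the default retries value, the first delays are 60, 60, 120, 180 and 300 seconds; only the delays after the ninth attempt depend on the random jump.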
For example, this is a complete configuration file:
# Default values
*/*
        interval=1m expiry=3d retries="1 1 2 3 5 8 13 21 34"
        maxring=0 maxta=0 skew=5 user=root group=daemon
# Boilerplate parameters for local delivery and service channels
local/*
        interval=10s expiry=3h maxchannel=2 command=mailbox
error
        interval=5m maxchannel=10 command=errormail
hold/*
        interval=5m maxchannel=1 command=hold
# Miscellaneous channels supported by router configuration
smtp/*.toronto.edu
        maxchannel=10 maxring=2
smtp
        maxchannel=10 maxring=5
uucp/*
        maxchannel=5 command="sm c $channel uucp"
The first clause (*/*) sets up default values for all addresses. There is no command specification, so clause matching will continue after addresses have picked up the parameters set here.
The third clause (error) has an implicit host wildcard of '*', so it would match the same as specifying error/* would have.
The fifth clause (smtp/*.toronto.edu) has no further components so it selects the components of the following nonempty clause (the sixth).
Both the fifth and sixth clauses are specific to address destinations within the TORONTO.EDU and UTORONTO.CA organization (the two are parallel domains). At most 10 deliveries to the smtp channel may be concurrently active, and at most 2 for all possible hosts within TORONTO.EDU. If $host is mentioned in the command specification, the transport agent will only be told about the message control files that indicate SMTP delivery to a particular host. The actual host is picked at random from the current choices, to avoid systematic errors leading to a deadlock of any queue.
The scheduler can assign several of its internal parameters by having variable assignments beginning at column 0, and beginning with "PARAM" text:
PARAMmailqpath = "UNIX:/path/to/pf_unix/mailq/socket"
PARAMmailqpath = "TCP:mailq"
PARAMmailqpath = "TCP:174"
These define two different types of possible socket addresses for the mailq protocol: a UNIX socket, and a TCP socket. Default is
PARAMauthfile = "/path/to/scheduler.auth"
Location of the MAILQv2 authentication control file.
PARAMglobalreportinterval = 15m
Interval at which all permanent reports accumulated for a message are reported. This sends out early reports of delivery failures, so that the sender is not forced to wait for the maximum queue timeout when the message has more than one recipient.
A message control file contains all the information needed by delivery programs like scheduler and the transport agents. It is a terse presentation of the router's decisions, along with some useful reference information.
The message control file consists of a number of fields.
All fields start in the first column (i.e., at the beginning of the file or just after a newline), and most fields extend to the end of line. The one exception is the message header field, which extends to a double-newline terminator.
For all but this message header field, the second column is reserved for a tag byte. This position is used to lock the field and to indicate the status of past processing of the field. For example, the success or failure of delivery to a recipient address is indicated by a '+' or '-' respectively. A space means the field has not been processed, or that processing has been deferred. A '~' indicates the field is locked because some transport agent is currently processing delivery for the address. The known field names and tags are defined in <mail.h>.
For each recipient address, there is a six-character space for the transport-agent process id, so that a quickly restarted scheduler will not cause double delivery when some slowly running transporter is still at work.
The following fields are mandatory:
@ 0xHHHHHH
Carries hex-encoded bit flags describing which format this dataset is really in. This ensures that the ``featurefullness'' relation router <= scheduler <= TA programs is not violated, and that the messages in the spool meet this criterion too. Rolling binaries back too far might break things.

i 123456-789
the name of the message file in the POSTOFFICE/queue directory and of the message control file in the POSTOFFICE/transport directory. the system and therefore are named by their inode number.

o NNNNNNN
the byte offset of the message body in the original message.
The following fields will frequently exist:
e user@some.domain
is the return address for error messages, in a form that can be put in a To: header line.
l <mumble@jumble>
is a string identifying this message in log entries. Typically the message id of the message would be used.
The following fields will occasionally appear:
x <jumble@mumble>
is the log identification string (usually a message id) of an obsoleted message. The scheduler will purge any such identified message after running sanity checks.
v ../public/v_filename
is the name of a file that the delivery system can append log information to. This would appear as the result of running sendmail -v or Mail -v. Since all programs need to refer to the same file, on mail clusters it is recommended that this be a relative path naming a file within the POSTOFFICE directory hierarchy.
A message control file must contain at least one address "group". Each group consists of a sender address field, one or more recipient address fields, and a message header that goes along with these.
An address field is a string containing a space-separated 4-tuple (quad) as follows:

channel
is the name of the delivery channel for this address. This must be a contiguous word.

host
is the name of the next destination host for this address. This too must be a contiguous word.

user
is the address to be handed to the destination host for further delivery. This string may contain spaces. It is distinguishable because the last component cannot contain spaces.

privilege
is the numeric uid representing the privileges associated with this address.
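Because only the user component may contain spaces, an address quad can be split from both ends. The following Python sketch shows this; the function name parse_address_quad and the sample addresses are hypothetical.

```python
def parse_address_quad(field):
    """Split 'channel host user privilege' into its four components.

    channel and host are contiguous words taken from the left, the
    privilege is the last word, and whatever remains in the middle is
    the user string, which may itself contain spaces.
    """
    channel, host, rest = field.split(" ", 2)
    user, privilege = rest.rsplit(" ", 1)
    return channel, host, user, int(privilege)
```

For example, "smtp relay.example.org John Doe <jd@example.org> 0" yields the user string "John Doe <jd@example.org>" and the privilege 0.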
The address group components are:
s <address quad>
carries a sender address field in address-quad form.

r PPPPPPDDDD<address quad>
is a recipient address field in address-quad form, but also contains fields for the transport agent pid number (PPPPPP), and a four-character space for delay reporting by the scheduler (DDDD).

N string
recipient.

n string
is the delivery-status-notification environment id data for the previous recipient.

R string
The DSN RET-mode setting (HDRS/FULL) for the previous recipient.

X nnnnnn
is an XOR recipient address field. The first element is a tag (a class number) to identify collections of recipient addresses which are equivalent (and therefore mutually exclusive). This is followed by an address field as described above.

m
carries the message header for this address group.
After one or more of these address groups, the error messages for addresses are appended to the message control file. This is done by the scheduler as it receives error reports from transport agents.
d <diagstring>
Is the storage format for diagnostics for recipient addresses.
The structure of the diagnostic string is:
d id:headeroffset:drptoffset::diagtime\tnotarydata\tmessage
d 172:347:226::964917927 ...
The id field gives the byte offset of the ``r'' of the recipient definition line.
The headeroffset gives the byte offset of the first byte of the headers associated with the address group. The drptoffset points to the possible ``N'' line. The diagtime is just the system time of the event. The notarydata and message are explained further below, and are identical to the objects of the same name in the transport-agent to scheduler communication protocol.
For example, this is a typical message control file (it is a snapshot taken while a transport agent was running):
i 15582
o 60
l <90Jun3.165355edt.15582@neat.cs.toronto.edu>
e Rayan Zachariassen <rayan>
s local rayan 7
r~23456 local rayan 7
m
Received: by neat.cs.toronto.edu id <15582>;
        Sun, 3 Jun 1990 16:53:55 -0400
From: To:
Rayan Zachariassen <rayan>
Subject: a typical message control file
Message-Id: <90Jun3.165355edt.15582@neat.cs.toronto.edu>
Date: Sun, 3 Jun 1990 16:53:54 -0400
The transport agent interface follows a master-slave model, where the TA informs the
scheduler that it is ready for work, and then the scheduler sends it one job description
and waits for diagnostics. Once the job is finished, the TA notifies the scheduler that it is
ready for a new job.
A short sample session looks like this:
S: (start the transport agent)
T: #hungry
S: spoolid\thostspec
T: diagnostics
T: #hungry
(etc. active work)
T: #hungry
S: #idle
T: #hungry
(the scheduler moved the TA into the IDLE pool)
S: spoolid\thostspec
(the TA was reactivated from the IDLE pool, doing work)
T: #hungry
S: EOF
(the scheduler determined that the TA should be killed)
T: (exits)
("S" = Scheduler, "T" = Transport agent)
Normal diagnostic output is of the form:
id / offset \t notarydata \t status SPC message

where:

id
is the inode number of the message file,

offset
is a byte offset within its control file where the address being reported on is kept,

notarydata
is a Ctrl-A separated quintet/sextet carrying delivery-status-notification information for the recipient.

status
is one of ok, ok2, ok3, error, error2, deferred, deferall, or retryat, and

message
is descriptive text associated with the report. The text is terminated by a linefeed. Any other format (as might be produced by subprocesses) is passed to standard output for logging in the scheduler log. The retryat response will assume the first word of the text is a numeric parameter, either an incremental time in seconds if prefixed by +, or otherwise an absolute time in seconds since epoch.
The exit status is a code from <sysexits.h>.
The notarydata has Control-A separated subfields, five or six of them:
FinalRcptAddress DSNAction ENHStatus ReportString WTTHost [WTTTAid]

FinalRcptAddress
This is the final form of the recipient address used at the final delivery, which the diagnostic data reports about. Compare with the ORCPT= data!

DSNAction
One of: "delivered", "failed", "relayed", "delayed", "expanded".

ENHStatus
Enhanced status code per RFC 2034.

ReportString
A freeform text (one line, with no embedded CRs, for example).

WTTHost
For SMTP systems, to produce the "Remote-MTA:" header contents.

WTTTAid
ZMailer-specific extension to report the process name and PID number of the transport agent producing this report. Helps to hunt for the syslogged data relating to this diagnostics instance.
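A diagnostic report line and its notarydata could be taken apart as in the following Python sketch. It assumes that the top-level fields are separated by single tab characters and that status and message are separated by one space; the class and function names, and the sample values in the usage note, are hypothetical.

```python
from typing import NamedTuple, Optional


class Notary(NamedTuple):
    final_rcpt: str
    action: str
    enh_status: str
    report: str
    wtt_host: str
    wtt_taid: Optional[str]  # sixth subfield, optional


class Report(NamedTuple):
    inode: int
    offset: int
    notary: Notary
    status: str
    message: str


def parse_notary(data):
    """Split Control-A separated notarydata into its 5 or 6 subfields."""
    fields = data.split("\x01")
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 notary subfields")
    if len(fields) == 5:
        fields.append(None)
    return Notary(*fields)


def parse_report(line):
    """Parse one 'id/offset TAB notarydata TAB status SPC message' line."""
    ident, notary, rest = line.rstrip("\n").split("\t", 2)
    inode, offset = ident.split("/")
    status, _, message = rest.partition(" ")
    return Report(int(inode), int(offset), parse_notary(notary),
                  status, message)
```

A line reporting inode 15582, offset 172 and status ok would come back as a Report whose notary.wtt_taid is None when only five subfields are present.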
The statistics log reports condensed performance-oriented information in the following format:
where the fields are:

timestamp
The original spoolfile ctime (creation time) stamp in decimal.

fileid
Spoolfile name after the router has processed it.

dt1
The time difference from spoolfile ctime to scheduler control file creation by the router.

dt2
The time difference from scheduler file ctime to the delivery that is being logged.

state
What happened? Values: ok, error, expiry.

$channel/$host
Where/how it was processed.
Upon accepting a TCP connection on the MAILQ port (TCP port 174), the scheduler dumps data to the TCP stream in the following format and immediately closes the connection:
The TCP stream syntax is:
version id\n
data in id-dependent format<close>
The first line (all bytes up to an ASCII LF character, octal 12) is used to identify the syntax of
all bytes following the line terminator LF. The first 8 characters of the first line are "version "
as a check that this is indeed a MAILQ port server that has been reached, the remaining
bytes are the real data format identification. The data is interpreted according to that format
until the terminating connection close.
Format identifiers should be registered with the author. The only one currently defined is
"zmailer 1.0". For that data format, the syntax of the data following the first LF is:
Vertices:\n
(<key>:\t<msgfile>\t<naddrs>; <off1>(,<offN>)*\t[#<text>]\n)*
(Channels:\n
(<word>:\t(><key>)+\n)+
Hosts:\n
(<word>:\t(><key>)+\n)+)?
End:\n
<mailq Q thread and status report>
Where:
For example, here is sample output from connecting to the MAILQ port:
version zmailer 1.0
Vertices:
311424: 3714 1; 116
311680: 6472 2; 151,331 #128.100.8.4: Null read! (will retry)
312192: 6347 1; 152 #128.89.0.93: connect: Connection timed out (will retry)
Channels:
smtp: >311424>311680>312192
Hosts:
scg.toronto.edu: >311424
mv04.ecf.toronto.edu: >311680
relay1.cs.net: >312192
This is sufficient information to be able to reconstruct the transport queues as seen by the
scheduler process, and to find more information than what is shown here by actually looking
up the message control and data files referred to.
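A client could rebuild the three tables from such a dump roughly as follows. This Python sketch is an assumption-laden illustration rather than what mailq(1) does: the function name parse_mailq_v1 is hypothetical, fields are split on any whitespace instead of exact tab positions, and everything after an End: line is ignored.

```python
def parse_mailq_v1(text):
    """Parse a 'zmailer 1.0' dump into Vertices/Channels/Hosts tables."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("version "):
        raise ValueError("not a MAILQ server response")
    if lines[0][len("version "):] != "zmailer 1.0":
        raise ValueError("unsupported data format")
    tables = {"Vertices": {}, "Channels": {}, "Hosts": {}}
    section = None
    for line in lines[1:]:
        if line == "End:":
            break
        if line.endswith(":") and line[:-1] in tables:
            section = line[:-1]          # a section header line
            continue
        if section is None or not line.strip():
            continue
        name, rest = line.split(":", 1)
        if section == "Vertices":
            head, _, text_part = rest.partition("#")
            left, _, offsets = head.partition(";")
            msgfile, naddrs = left.split()
            tables[section][name] = (
                msgfile, int(naddrs),
                [int(o) for o in offsets.strip().split(",")],
                text_part or None)
        else:
            # Channel and host entries list vertex keys as '>key>key...'
            tables[section][name] = [k for k in rest.strip().split(">") if k]
    return tables
```

Applied to the sample output above, the smtp channel maps to the three vertex keys 311424, 311680 and 312192, and each host maps to one of them.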
The MAILQv2 protocol is an interactive, authenticating protocol, unlike its predecessor (v1).
The system begins with a greeting giving the version, followed by one line of challenge to be
used in the subsequent authentication command:
version zmailer 2.0\n
MAILQV2CHALLENGE: 942665308.906504.3\n
Protocol commands are:
AUTH username hexauthenticator
The "login" of the mailq session. The hexauthenticator is the lowercase hexadecimal printout of the MD5 checksum run over the concatenation of the challenge string (without its ending newline character) and the user's password. This algorithm is essentially the same as the one the APOP scheme uses.
PERLish example with the above challenge:
$authen = MD5hex("MAILQV2CHALLENGE: 942665308.906504.3"."mypasswd")

SHOW SNMP
Implements `mailq -QQQ'.

SHOW QUEUE SHORT
Implements `mailq -QQ'.

SHOW QUEUE THREADS
Implements `mailq -Q'.

SHOW THREAD channel host
Reports details usable to implement a mailq-v1-like interface. The details

ETRN etrn_string
Supports the ETRN-cluster subsystem at smtpserver.

KILL MSG spoolid
Unimplemented.

KILL THREAD channel host
Unimplemented.
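The AUTH authenticator described above can be computed with Python's standard hashlib module. The function name mailq_authenticator is hypothetical, and encoding the password as UTF-8 is an assumption; the text only specifies MD5 over the challenge line followed by the password.

```python
import hashlib


def mailq_authenticator(challenge_line, password):
    """Return the lowercase hex MD5 of challenge + password.

    The challenge is the complete challenge line as received, minus its
    terminating newline.
    """
    challenge = challenge_line.rstrip("\r\n")
    digest = hashlib.md5((challenge + password).encode("utf-8"))
    return digest.hexdigest()
```

A client would then send "AUTH username " followed by the returned string.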
Responses are written out to the same socket in a POP-like manner:
AUTH ....\n
+OK or LOGIN FAILED\n
SHOW SNMP\n
+OK until LF.LF\n
text\n
If the output text contains a dot at the beginning of a line, it is duplicated in SMTP (and
POP) style.
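On the client side, the duplicated dots have to be removed again while reading a multi-line response. A minimal Python sketch, assuming the "+OK" status line has already been consumed and that the text ends with a line holding a single dot; the function name read_multiline is hypothetical.

```python
def read_multiline(lines):
    """Collect response text lines up to the terminating '.' line.

    `lines` is any iterable of received lines. A leading dot that was
    duplicated by the sender is removed again.
    """
    text = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            return text
        if line.startswith("."):
            line = line[1:]
        text.append(line)
    raise ValueError("response ended before the terminating '.' line")
```

Passing a socket file object opened in text mode as `lines` would read exactly one response and leave the rest of the stream untouched.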
Of the various commands, the "SHOW" class implements multiple text-line outputs; the others return only
"+OK" (or "ERR...").
For authenticating MAILQv2 protocol users, the system can use a
PARAMauthfile="/path/to/file.auth" PARAM assignment to identify the file containing the
authentication data. That file is used both to authenticate users and to define what each user may do through the MAILQv2
port.
#
# APOP-like authentication control file for the ZMailer scheduler.
#
# Fields are double-colon (':') separated, and are:
#   - Username
#
# The default-account for 'mailq' is 'nobody' with password 'nobody'.
# Third field is at the moment a WORK IN PROGRESS!
#
# SECURITY NOTE:
#   OWNER: root
#   ALL    well, a wild-card enabling everything
#   "nobody" from anywhere else.
nobody:nobody:SNMP ETRN:[0.0.0.0]/0 [ipv6.0::0]/0
LOGDIR
defines the location of log files. Example: LOGDIR=/var/log/mail

MAILBIN
Defines where the executable transport-agent binaries exist, under the $MAILBIN/ta/ directory.

MAILSHARE
Location of scheduler configuration files.

PATH
What PATH environment variable to give to transport-agent subprograms.

POSTOFFICE
defines the directory under which all POSTOFFICE functions are located. Example: POSTOFFICE=/var/spool/postoffice

SCHEDULERDIRHASH
Carries a numeric value of ``1'' or ``2'' (if defined at all), which will then override a possible ``H'' option.

SYSLOGFLG
The existence of a ``c'' or ``C'' character in the value string enables syslogging of some events as seen by the scheduler.

ZCONFIG
Gives the location of zmailer.conf.
If the ZMailer system is configured with tcp-wrapper code, then the service id "mailq" is
looked up for all those addresses that are allowed to do queries.
Usually the files hosts.allow and hosts.deny contain the following kind of entries:
/ETC/hosts.allow
mailq : ALL@1.2.3.0
smtp-receive: ALL@ALL
/ETC/hosts.deny
(Do note that smtpserver(8) also has tcp-wrapper support, which becomes active
simultaneously with the scheduler's tcp-wrapper code!)
SIGHUP:
close and reopen the stdout/stderr log file.

SIGTERM:
exit cleanly.

SIGQUIT:
exit cleanly, but first order the transporter children to shut down, and collect their status reports.

SIGALRM:
check pending work.

SIGUSR1:
reread the scheduler configuration file.

SIGUSR2:
dump state information to the rendezvous file.
router(8), mailq(1)
AUTHOR
This program was authored by, and is copyright by:
Rayan Zachariassen <rayan@cs.toronto.edu>
Plenty of changes and several real bugfixes by:
Matti Aarnio <mea@nic.funet.fi>