


1.24 processes

Using the processes facility, you can test for the existence of processes, signal (kill) processes and optionally restart them. Cfengine opens a pipe from the system ps command and uses regular expressions to search the lines of its output. The regular expression does not have to match a whole line; it only needs to match a substring of the process line. The form of a process command is

     processes:
     
         "quoted regular expression"
     
                             restart "shell command"
                             useshell=true/false/dumb
                             owner=restart-uid
                             group=restart-gid
                             chroot=directory
                             chdir=directory
                             umask=mask
     
                             signal=signal name
                             matches=number
                             define=classlist
                             action=signal/do/warn/bymatch
                             include=literal
                             exclude=literal
                             syslog=true/on/false/off
                             inform=true/on/false/off
     
         SetOptionString "quoted option string"
     
By default, the options sent to ps are "-aux" for BSD systems and "-ef" for System V. You can use the SetOptionString command to redefine the option string. Cfengine assumes only that the first identifiable number on each line is the process identifier, so you must not choose options for ps which change this basic requirement (this is not a problem in practice). Cfengine normally reads the output of the ps command only once and searches through it in memory. The process table is only re-consulted if SetOptionString is called. The options have the following meanings:

signal=signal name
This option defines the name of a signal which is to be sent to all processes matching the quoted regular expression. If this option is omitted, no signal is sent. The signal names have the usual meanings. The full list, with largely standardized meanings, is
             hup       1   hang-up
             int       2   interrupt
             quit      3   quit
             ill       4   illegal instruction
             trap      5   trace trap
             iot       6   iot instruction
             emt       7   emt instruction
             fpe       8   floating point exception
             kill      9   kill signal
             bus      10   bus error
             segv     11   segmentation fault
             sys      12   bad argument to system call
             pipe     13   write to non existent pipe
             alrm     14   alarm clock
             term     15   software termination signal
             urg      16   urgent condition on I/O channel
             stop     17   stop signal (not from tty)
             tstp     18   stop from tty
             cont     19   continue
             chld     20   to parent on child exit/stop
             gttin    21   to readers pgrp upon background tty read
             gttou    22   like TTIN for output if (tp->t_local&LTOSTOP)
             io       23   input/output possible signal
             xcpu     24   exceeded CPU time limit
             xfsz     25   exceeded file size limit
             vtalrm   26   virtual time alarm
             prof     27   profiling time alarm
             winch    28   window changed
             lost     29   resource lost (eg, record-lock lost)
             usr1     30   user defined signal 1
             usr2     31   user defined signal 2
          
     

Note that cfengine will not attempt to signal or restart processes 0 to 3 on any system since such an attempt could bring down the system. The only exception is that the hangup (hup) signal may be sent to process 1 (init) which normally forces init to reread its terminal configuration files.

restart "shell command"
Note the syntax: there is no equals sign here. If the keyword `restart' appears, then the next quoted string is interpreted as a shell command which is to be executed after any signals have been sent. This command is only issued if the number of processes matching the specified regular expression is zero, or if the signal sent was signal 9 (sigkill) or 15 (sigterm), i.e. the normal termination signals. This could be used to restart a daemon, for instance. Cfengine executes this command and waits for its completion, so you should normally only use this feature to execute non-blocking commands, such as daemons which dissociate themselves from the I/O stream and place themselves in the background.

Some unices leave a hanging pipe on restart (they never manage to detect the end-of-file condition). This occurs with POSIX.1 and SVR4 popen calls which use wait4. For some reason they fail to find an end-of-file for an exiting child process and go into a deadlock trying to read from an already dead process. This leaves a zombie behind (the parent daemon process which forked and was supposed to exit), though the child continues. A way around this is to use a wrapper script which prints the line "cfengine-die" to STDOUT after restarting the process. This causes cfengine to close the pipe forcibly and continue. Cfengine places a timeout on the restart process and attempts to clean up zombies, but you should be aware of this possibility.
owner=,group=
Sets the process uid and gid (setuid, setgid) for processes which are restarted. This applies only when cfengine is run by root.
chroot
Changes the process root directory of the restarted process, creating a `sandbox' which the process cannot escape from. Best used together with a change of owner, since a root process can break out of such a confinement in principle.
chdir
Change the current working directory of the restarted process.
useshell=true/false/dumb
When restarting processes, cfengine normally uses a shell to interpret and execute the restart command. This has inherent security problems associated with it. If you set this option to false, cfengine executes restart commands without using a shell. This is recommended, but it does mean that you cannot use any shell operators or features in the restart command-line.

Some programs (like cron) do not handle I/O properly when they fork their daemon parts. This causes a zombie process and normally hangs cfengine. If you choose the value `dumb' for this option, cfengine ignores all output from the program and does not use a startup shell. This prevents programs like cron from hanging cfengine.

matches=number
This option may be used to set a maximum, minimum or exact number of matches. If the number of matches to the regular expression does not satisfy this value, cfengine issues a warning. The <, > symbols are used to specify upper and lower limits. For example,
            matches=<6  # warn if the number of matches is greater than or equal to 6
            matches=1   # warn if there is not exactly 1 matching process
            matches=>2  # warn if the number of matches is less than or equal to 2
     


include=literal
Items listed as includes provide an extra level of selection after the regular expression matches have been expanded. If you specify at least one include option, then only lines containing one or more of the literal strings or wildcards will be matched.
exclude=literal
Process lines containing literal strings or wildcards in exclude statements are not matched. Excludes are processed after regular expression matching and after includes.
define=classlist
The colon, comma or dot separated list of classes becomes activated if the number of regular expression matches is non-zero.
action=signal/do/warn/bymatch
The default value of this option is to silently send a signal (if one was defined using the signal option) to matching processes. This is equivalent to setting the value of this parameter to signal or do. If you set this option to warn, cfengine sends no signal, but prints a message detailing the processes which match the regular expression. If the option is set to bymatch, then signals are only sent to the processes if the matches criteria fail.
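
As a rough sketch of how the restart options described above can be combined, the entry below would start a daemon without a shell, under an unprivileged identity and from a fixed working directory, whenever no matching process is found. The daemon name mydaemon, its path, the user and group name daemon and the directory /var/run are hypothetical placeholders. It is also an assumption that owner and group accept names as well as numeric ids.

     processes:

        # mydaemon, its path, daemon and /var/run are assumed names
        "mydaemon" restart "/usr/local/sbin/mydaemon"
                   useshell=false
                   owner=daemon
                   group=daemon
                   chdir=/var/run

Because no signal option is given, nothing is sent to existing processes, and the restart command is issued only when the number of matches is zero. The owner and group settings take effect only when cfengine is run by root.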

Here is an example script which sends the hang-up signal to cron, forcing it to reread its crontab files:

     
     processes:
     
        "cron" signal=hup
     

Here is a second example which may be used to restart the nameservice on a Solaris system:

     
     processes:
     
        solaris::
     
            "named" signal=kill restart "/usr/sbin/in.named"
     
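If a restarted daemon leaves cfengine waiting on the pipe, as described under restart above, the restart command can point at a wrapper instead of the daemon itself. The following sketch assumes a hypothetical wrapper script /usr/local/sbin/start-named, which you would have to write yourself. The wrapper launches /usr/sbin/in.named and then prints the line cfengine-die to standard output:

     processes:

        solaris::

            # /usr/local/sbin/start-named is a hypothetical wrapper, not part of cfengine
            "named" signal=kill restart "/usr/local/sbin/start-named"
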

A more complex match could be used to look for processes belonging to a particular user. Here is a script which kills ftp related processes belonging to a particular user who is known to spend the whole day FTP-ing files:

     
     control:
     
         actionsequence = ( processes )
     
       #
       # Set a kill signal here for convenience
       #
     
         sig = ( kill )
     
       #
       # Better not find that dumpster here!
       #
     
         matches = ( 1 )
     
     processes:
     
        #
        #  Look for Johnny Mnemonic trying to dump his head, user = jmnemon
        #
     
        ".*jmnemon.*ftp.*" signal=$(sig) matches=<$(matches) action=signal
     
        # No mercy!
     

The regular expression .* matches any number of characters, so this command searches for a line containing both the username and something to do with ftp and sends these processes the kill signal. Further examples may be found in the FAQ section (see FAQS and Tips).
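
A similar selection can also be approached with the include and exclude options, which filter on literal strings after the regular expression has matched. The following is only a sketch that reuses the user name from the example above. It is not an exact equivalent, since include does not care where in the line the string appears, and it uses action=warn so that nothing is signalled:

     processes:

        # report ftp lines that mention jmnemon, skipping any line containing "grep"
        "ftp" include=jmnemon exclude=grep action=warn

Excludes are applied after includes, so a line containing both jmnemon and grep would be dropped.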

You can arrange for signals to be sent only if the number of matches fails the test. The action=bymatch option is used for this. For instance, to kill process `XXX' only if the number of matches is not less than 20, one would write:

     
     processes:
     
        "XXX" matches=<20  action=bymatch signal=kill
     
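For monitoring rather than killing, the matches test can be combined with action=warn and define. This is a sketch under the semantics described in this section, and the class name named_running is an arbitrary, assumed name:

     processes:

        # warn unless exactly one matching process exists; send no signal
        # named_running (assumed class name) is activated if any process matches
        "named" matches=1 action=warn define=named_running
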

See also filters, for more complex searches.