This section should be read in conjunction with supervisor(3), where all details about the supervisor behaviour are given.
A supervisor is responsible for starting, stopping and monitoring its child processes. The basic idea of a supervisor is that it should keep its child processes alive by restarting them when necessary.
Which child processes to start and monitor is specified by a list of child specifications. The child processes are started in the order specified by this list, and terminated in the reverse order.
The callback module for a supervisor starting the server from the gen_server chapter could look like this:
-module(ch_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(ch_sup, []).

init(_Args) ->
    {ok, {{one_for_one, 1, 60},
          [{ch3, {ch3, start_link, []},
            permanent, brutal_kill, worker, [ch3]}]}}.
one_for_one is the restart strategy.
1 and 60 define the maximum restart frequency.
The tuple {ch3, ...} is a child specification.
one_for_one: If a child process terminates, only that process is restarted.
one_for_all: If a child process terminates, all other child processes are terminated and then all child processes, including the terminated one, are restarted.
rest_for_one: If a child process terminates, the 'rest' of the child processes (that is, the child processes after the terminated process in start order) are terminated. Then the terminated child process and the rest of the child processes are restarted.
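The difference between the strategies is easiest to see with several children. The following is a minimal sketch of a rest_for_one supervisor, assuming three hypothetical callback modules reader, parser and writer that each export start_link/0 (these names, and the module name pipeline_sup, are not part of this chapter):

```erlang
-module(pipeline_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(pipeline_sup, []).

%% reader, parser and writer are hypothetical modules used only
%% for illustration. They are started in list order.
init(_Args) ->
    Children =
        [{reader, {reader, start_link, []}, permanent, 2000, worker, [reader]},
         {parser, {parser, start_link, []}, permanent, 2000, worker, [parser]},
         {writer, {writer, start_link, []}, permanent, 2000, worker, [writer]}],
    %% rest_for_one: if parser terminates, writer is terminated too,
    %% then parser and writer are restarted. reader is left running.
    {ok, {{rest_for_one, 1, 60}, Children}}.
```

Replacing rest_for_one with one_for_all would restart all three children whenever any one of them terminates, and one_for_one would restart only the terminated child.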
Supervisors have a built-in mechanism to limit the number of restarts that can occur in a given time interval. This is determined by the values of the two parameters MaxR and MaxT in the start specification returned by the callback function init:
init(...) -> {ok, {{RestartStrategy, MaxR, MaxT}, [ChildSpec, ...]}}.
If more than MaxR restarts occur within the last MaxT seconds, the supervisor terminates all its child processes and then itself.
When the supervisor terminates, the next higher-level supervisor takes some action: it either restarts the terminated supervisor or terminates itself.
The intention of the restart mechanism is to prevent a situation where a process repeatedly dies for the same reason, only to be restarted again.
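The rule can be illustrated with a small pure function. This is only a sketch of the counting logic described above, not the code used inside the supervisor module; the module name restart_window, the function add_restart/4, and the use of plain integer timestamps in seconds are assumptions made for the example:

```erlang
-module(restart_window).
-export([add_restart/4]).

%% Restarts is a list of timestamps (in seconds) of earlier restarts.
%% Now is the timestamp of the restart being recorded.
%% Returns {ok, Recent} while the limit holds, or {shutdown, Recent}
%% once more than MaxR restarts fall within the last MaxT seconds.
add_restart(Restarts, Now, MaxR, MaxT) ->
    %% Keep only the restarts that are still inside the time window.
    Recent = [T || T <- [Now | Restarts], Now - T =< MaxT],
    case length(Recent) > MaxR of
        true  -> {shutdown, Recent};
        false -> {ok, Recent}
    end.
```

With MaxR = 1 and MaxT = 60, as in ch_sup, a single restart is tolerated, a second restart 30 seconds later exceeds the limit, and a second restart 100 seconds later does not because the first one has left the window.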
This is the type definition for a child specification:
{Id, StartFunc, Restart, Shutdown, Type, Modules}
    Id = term()
    StartFunc = {M, F, A}
        M = F = atom()
        A = [term()]
    Restart = permanent | transient | temporary
    Shutdown = brutal_kill | integer() >=0 | infinity
    Type = worker | supervisor
    Modules = [Module] | dynamic
        Module = atom()
Id is a name that is used to identify the child specification internally by the supervisor.
StartFunc defines the function call used to start the child process. It is a module-function-arguments tuple used as apply(M, F, A). It should be (or result in) a call to supervisor:start_link, gen_server:start_link, gen_fsm:start_link or gen_event:start_link (or a function compliant with these functions, see supervisor(3) for details).
Restart defines when a terminated child process should be restarted. A permanent child process is always restarted. A temporary child process is never restarted. A transient child process is restarted only if it terminates abnormally, that is, with an exit reason other than normal.
Shutdown defines how a child process should be terminated. brutal_kill means the child process is unconditionally terminated using exit(Child, kill). An integer timeout value (in milliseconds) means that the supervisor tells the child process to terminate by calling exit(Child, shutdown) and then waits for an exit signal back. If no exit signal is received within the specified time, the child process is unconditionally terminated using exit(Child, kill). If the child process is another supervisor, Shutdown should be set to infinity to give the subtree enough time to shut down.
Type specifies whether the child process is a supervisor or a worker.
Modules should be a list with one element [Module], where Module is the name of the callback module, if the child process is a supervisor, gen_server or gen_fsm. If the child process is a gen_event, Modules should be dynamic.
Example: The child specification to start the server ch3 in the example above looks like:
{ch3, {ch3, start_link, []}, permanent, brutal_kill, worker, [ch3]}
Example: A child specification to start the event manager from the chapter about gen_event:
{error_man, {gen_event, start_link, [{local, error_man}]}, permanent, 5000, worker, dynamic}
Both the server and the event manager are registered processes that can be expected to be accessible at all times, so they are specified to be permanent.
ch3 does not need to do any cleaning up before termination, so no shutdown time is needed and brutal_kill is sufficient. error_man may need some time for the event handlers to clean up, so Shutdown is set to 5000 ms.
Example: A child specification to start another supervisor:
{sup, {sup, start_link, []}, transient, infinity, supervisor, [sup]}
In the example above, the supervisor is started by calling ch_sup:start_link():
start_link() -> supervisor:start_link(ch_sup, []).
ch_sup:start_link calls the function supervisor:start_link/2. This function spawns and links to a new process, a supervisor.
The first argument, ch_sup, is the name of the callback module, that is, the module where the init callback function is located.
The second argument, [], is a term that is passed as-is to the callback function init. Here, init does not need any input data and ignores the argument.
In this case, the supervisor is not registered. Instead its pid must be used. A name can be specified by calling supervisor:start_link({local, Name}, Module, Args) or supervisor:start_link({global, Name}, Module, Args).
The new supervisor process calls the callback function ch_sup:init([]). init is expected to return {ok, StartSpec}:
init(_Args) ->
    {ok, {{one_for_one, 1, 60},
          [{ch3, {ch3, start_link, []},
            permanent, brutal_kill, worker, [ch3]}]}}.
The supervisor then starts all its child processes according to the child specifications in the start specification. In this case there is one child process, ch3.
Note that supervisor:start_link is synchronous. It does not return until all child processes have been started.
In addition to the static supervision tree, we can also add dynamic child processes to an existing supervisor with the following call:
supervisor:start_child(Sup, ChildSpec)
Sup is the pid, or name, of the supervisor. ChildSpec is a child specification.
Child processes added using start_child/2 behave in the same manner as the other child processes, with one important exception: if a supervisor dies and is re-created, all child processes that were dynamically added to the supervisor are lost.
Any child process, static or dynamic, can be stopped in accordance with the shutdown specification:
supervisor:terminate_child(Sup, Id)
The child specification for a stopped child process is deleted with the following call:
supervisor:delete_child(Sup, Id)
Sup is the pid, or name, of the supervisor. Id is the id specified in the child specification.
As with dynamically added child processes, the effects of deleting a static child process are lost if the supervisor itself restarts.
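The three calls can be combined into a pair of helper functions. The following is a minimal sketch, assuming a hypothetical module dyn_child that acts both as the callback module of a supervisor without static children and as a home for the helpers add/2 and remove/2 (none of these names come from this chapter):

```erlang
-module(dyn_child).
-behaviour(supervisor).

-export([start_link/0, add/2, remove/2]).
-export([init/1]).

%% Start a supervisor that has no static child processes.
start_link() ->
    supervisor:start_link(dyn_child, []).

init(_Args) ->
    {ok, {{one_for_one, 1, 60}, []}}.

%% Add a child process dynamically. ChildSpec is a child
%% specification tuple as described earlier in this section.
add(Sup, ChildSpec) ->
    case supervisor:start_child(Sup, ChildSpec) of
        {ok, Pid}        -> {ok, Pid};
        {ok, Pid, _Info} -> {ok, Pid};
        {error, Reason}  -> {error, Reason}
    end.

%% Stop the child process according to its shutdown specification,
%% then delete its child specification from the supervisor.
remove(Sup, Id) ->
    case supervisor:terminate_child(Sup, Id) of
        ok              -> supervisor:delete_child(Sup, Id);
        {error, Reason} -> {error, Reason}
    end.
```

For example, dyn_child:add(Sup, {ev, {gen_event, start_link, []}, permanent, 5000, worker, dynamic}) starts an event manager under the supervisor, and dyn_child:remove(Sup, ev) stops it and removes its specification.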
A supervisor with restart strategy simple_one_for_one is a simplified one_for_one supervisor, where all child processes are dynamically added instances of the same process.
Example of a callback module for a simple_one_for_one supervisor:
-module(simple_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link(simple_sup, []).

init(_Args) ->
    {ok, {{simple_one_for_one, 0, 1},
          [{call, {call, start_link, []},
            temporary, brutal_kill, worker, [call]}]}}.
When started, the supervisor will not start any child processes. Instead, all child processes are added dynamically by calling:
supervisor:start_child(Sup, List)
Sup is the pid, or name, of the supervisor. List is an arbitrary list of terms that is appended to the list of arguments specified in the child specification. If the start function is specified as {M, F, A}, then the child process is started by calling apply(M, F, A++List).
For example, adding a child to simple_sup above:
supervisor:start_child(Pid, [id1])
results in the child process being started by calling apply(call, start_link, []++[id1]), or actually:
call:start_link(id1)
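The call module itself is not shown in this section. As an illustration only, a worker compatible with the child specification above could look like the following sketch, assuming it is a gen_server whose state is simply the Id it was started with (the get_id request and the stop message are hypothetical):

```erlang
-module(call).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Called by the supervisor as apply(call, start_link, [] ++ [Id]).
start_link(Id) ->
    gen_server:start_link(call, Id, []).

%% The server state is the Id passed to start_link/1.
init(Id) ->
    {ok, Id}.

%% Hypothetical request: return the Id this process was started with.
handle_call(get_id, _From, Id) ->
    {reply, Id, Id}.

%% Hypothetical message: terminate normally.
handle_cast(stop, Id) ->
    {stop, normal, Id}.
```

Because the child specification in simple_sup marks the child as temporary, a call process that stops is not restarted by the supervisor.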
Since the supervisor is part of a supervision tree, it is automatically terminated by its own supervisor. When asked to shut down, it terminates all child processes in reverse start order according to their respective shutdown specifications, and then terminates itself.