SLURM Job Accounting Plugin API
Overview
This document describes SLURM job accounting plugins and the API that defines them. It is intended as a resource to programmers wishing to write their own SLURM job accounting plugins. This is version 1 of the API.
SLURM job accounting plugins must conform to the SLURM Plugin API with the following specifications:
const char plugin_name[]="full text name"
A free-formatted ASCII text string that identifies the plugin.
const char plugin_type[]="major/minor"
The major type must be "jobacct".
The minor type can be any suitable name
for the type of accounting package. We currently use
The programmer is urged to study
src/plugins/jobacct/linux and
src/plugins/jobacct/common
for a sample implementation of a SLURM job accounting plugin.
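As a minimal sketch of the two identification symbols described above, a plugin source file might begin as follows. The minor type "example" and the helper has_jobacct_major() are hypothetical names chosen for illustration; neither is part of SLURM.

```c
#include <assert.h>
#include <string.h>

/* Free-form ASCII text that identifies the plugin. */
const char plugin_name[] = "Example job accounting plugin";

/* "major/minor": the major type must be "jobacct". The minor type
 * "example" is a made-up name used only in this sketch. */
const char plugin_type[] = "jobacct/example";

/* Hypothetical helper (not a SLURM function): returns 1 when a type
 * string carries the "jobacct" major type, 0 otherwise. */
int has_jobacct_major(const char *type)
{
    static const char major[] = "jobacct/";

    assert(type != NULL);
    return strncmp(type, major, sizeof(major) - 1) == 0;
}
```

The helper only illustrates the "major/minor" convention; the plugin loader performs its own checks.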
All of the following functions are required. Functions which are not
implemented must be stubbed.
int jobacct_p_startpoll(int frequency)
Description:
jobacct_p_startpoll() is called at the start of the slurmstepd.
It starts a thread that polls for accounting information so that the
information can be queried at any time until the process ends.
Put global initialization here.
Arguments:
frequency (input) polling frequency for the polling
thread.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_endpoll()
Description:
jobacct_p_endpoll() is called when the process is finished to stop the
polling thread.
Arguments:
none
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
void jobacct_p_suspend_poll()
Description:
jobacct_p_suspend_poll() is called when the process is suspended.
This causes the polling thread to halt until the process is resumed.
Arguments:
none
Returns:
none
void jobacct_p_resume_poll()
Description:
jobacct_p_resume_poll() is called when the process is resumed.
This causes the polling thread to resume operation.
Arguments:
none
Returns:
none
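The four polling functions above can be sketched with a POSIX thread that is guarded by a mutex and a condition variable. This is an illustration under stated assumptions, not the implementation found in the SLURM source tree: the frequency is assumed to be in seconds, a non-positive frequency is rejected, and SLURM_SUCCESS and SLURM_FAILURE are local stand-ins for the values defined in the SLURM headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define SLURM_SUCCESS 0    /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1) /* stand-in for the SLURM header value */

static pthread_mutex_t poll_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t poll_cond = PTHREAD_COND_INITIALIZER;
static pthread_t poll_thread;
static int poll_running = 0;   /* 1 between startpoll and endpoll */
static int poll_suspended = 0; /* 1 between suspend and resume */
static int poll_freq = 0;      /* ASSUMPTION: seconds between polls */
static unsigned long poll_count = 0;

static void *poll_loop(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&poll_lock);
    while (poll_running) {
        struct timespec deadline;

        if (!poll_suspended)
            poll_count++; /* a real plugin would sample task usage here */
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += poll_freq;
        /* Sleep until the deadline or until endpoll signals us. */
        pthread_cond_timedwait(&poll_cond, &poll_lock, &deadline);
    }
    pthread_mutex_unlock(&poll_lock);
    return NULL;
}

int jobacct_p_startpoll(int frequency)
{
    if (frequency <= 0)
        return SLURM_FAILURE; /* sketch choice, not a documented rule */
    pthread_mutex_lock(&poll_lock);
    if (poll_running) {
        pthread_mutex_unlock(&poll_lock);
        return SLURM_FAILURE;
    }
    poll_freq = frequency;
    poll_suspended = 0;
    poll_running = 1;
    pthread_mutex_unlock(&poll_lock);
    if (pthread_create(&poll_thread, NULL, poll_loop, NULL) != 0) {
        pthread_mutex_lock(&poll_lock);
        poll_running = 0;
        pthread_mutex_unlock(&poll_lock);
        return SLURM_FAILURE;
    }
    return SLURM_SUCCESS;
}

int jobacct_p_endpoll(void)
{
    pthread_mutex_lock(&poll_lock);
    if (!poll_running) {
        pthread_mutex_unlock(&poll_lock);
        return SLURM_FAILURE;
    }
    poll_running = 0;
    pthread_cond_signal(&poll_cond); /* wake the thread so it can exit */
    pthread_mutex_unlock(&poll_lock);
    return pthread_join(poll_thread, NULL) == 0 ? SLURM_SUCCESS
                                                 : SLURM_FAILURE;
}

void jobacct_p_suspend_poll(void)
{
    pthread_mutex_lock(&poll_lock);
    poll_suspended = 1;
    pthread_mutex_unlock(&poll_lock);
}

void jobacct_p_resume_poll(void)
{
    pthread_mutex_lock(&poll_lock);
    poll_suspended = 0;
    pthread_mutex_unlock(&poll_lock);
}
```

In this sketch a suspended thread keeps waking at each interval but skips sampling, and the timed wait lets jobacct_p_endpoll() stop the thread without waiting out a full interval.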
int jobacct_p_add_task(pid_t pid, uint16_t tid)
Description:
jobacct_p_add_task() is used to add a task to the poller.
Arguments:
pid (input) Process id
tid (input) slurm global task id
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
jobacctinfo_t *jobacct_p_stat_task(pid_t pid)
Description:
jobacct_p_stat_task() is used to get the most recent information about a task.
You need to FREE the information returned by this function!
Arguments:
pid (input) Process id
Returns:
jobacctinfo structure pointer on success, or
NULL on failure.
jobacctinfo_t *jobacct_p_remove_task(pid_t pid)
Description:
jobacct_p_remove_task() is used to remove a task from the poller.
You need to FREE the information returned by this function!
Arguments:
pid (input) Process id
Returns:
Pointer to removed jobacctinfo_t structure
on success, or
NULL on failure.
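The ownership rules for the three task functions above can be illustrated with a small linked list. This is a sketch only: the jobacctinfo_t layout shown here is a hypothetical stand-in for the real structure, the status macros are local stand-ins, and the locking that a plugin with a polling thread would need is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define SLURM_SUCCESS 0    /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1) /* stand-in for the SLURM header value */

/* Hypothetical stand-in for SLURM's jobacctinfo_t. */
typedef struct jobacctinfo {
    pid_t pid;
    uint16_t tid;
    uint32_t max_rss;
    struct jobacctinfo *next;
} jobacctinfo_t;

static jobacctinfo_t *task_list = NULL;

int jobacct_p_add_task(pid_t pid, uint16_t tid)
{
    jobacctinfo_t *task;

    if (pid <= 0)
        return SLURM_FAILURE;
    task = calloc(1, sizeof(*task));
    if (task == NULL)
        return SLURM_FAILURE;
    task->pid = pid;
    task->tid = tid;
    task->next = task_list;
    task_list = task;
    return SLURM_SUCCESS;
}

/* Returns a heap-allocated copy; the caller must free() it. */
jobacctinfo_t *jobacct_p_stat_task(pid_t pid)
{
    jobacctinfo_t *task;

    for (task = task_list; task != NULL; task = task->next) {
        if (task->pid == pid) {
            jobacctinfo_t *copy = malloc(sizeof(*copy));
            if (copy != NULL) {
                *copy = *task;
                copy->next = NULL;
            }
            return copy;
        }
    }
    return NULL;
}

/* Unlinks the record and hands it to the caller, who must free() it. */
jobacctinfo_t *jobacct_p_remove_task(pid_t pid)
{
    jobacctinfo_t **link;

    for (link = &task_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->pid == pid) {
            jobacctinfo_t *found = *link;
            *link = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}
```

Returning a copy from jobacct_p_stat_task() keeps the poller's own record intact, which is one way to satisfy the requirement that the caller frees the result.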
int jobacct_p_init_slurmctld(char *job_acct_log)
Description:
jobacct_p_init_slurmctld() is called at the start of the slurmctld.
It opens the logfile to be written to.
Put global initialization here.
Arguments:
job_acct_log (input) logfile name.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_fini_slurmctld()
Description:
jobacct_p_fini_slurmctld() is called at the end of the slurmctld.
It closes the logfile.
Arguments:
none
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_job_start_slurmctld(struct job_record *job_ptr)
Description:
jobacct_p_job_start_slurmctld() is called when a new job is allocated in
the slurmctld. It writes the job's starting information to the logfile.
Arguments:
job_ptr (input) information about the job in
slurmctld.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_job_complete_slurmctld(struct job_record *job_ptr)
Description:
jobacct_p_job_complete_slurmctld() is called at the end of a job in
the slurmctld. It writes the job's ending information to the logfile.
Arguments:
job_ptr (input) information about the job in
slurmctld.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_step_start_slurmctld(struct step_record *step_ptr)
Description:
jobacct_p_step_start_slurmctld() is called when a new step is allocated in
the slurmctld. It writes the step's starting information to the logfile.
Arguments:
step_ptr (input) information about the step in
slurmctld.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_step_complete_slurmctld(struct step_record *step_ptr)
Description:
jobacct_p_step_complete_slurmctld() is called at the end of a step in
the slurmctld. It writes the step's ending information to the logfile.
Arguments:
step_ptr (input) information about the step in
slurmctld.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_suspend_slurmctld(struct job_record *job_ptr)
Description:
jobacct_p_suspend_slurmctld() is called when a job is suspended or resumed in
the slurmctld. It writes information about the suspension or resumption of
the job to the logfile.
Arguments:
job_ptr (input) information about the job in
slurmctld.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
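The slurmctld-side functions share one open logfile. The following sketch shows that pattern for initialization, job start, job completion, and shutdown. Several pieces are assumptions made for illustration: struct job_record is reduced to a single job_id field, the record format "<job id> <event>" and the event names are invented, log_event() is a hypothetical helper, and the status macros are local stand-ins.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SLURM_SUCCESS 0    /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1) /* stand-in for the SLURM header value */

/* Hypothetical stand-in: the real struct job_record has many more fields. */
struct job_record {
    uint32_t job_id;
};

static FILE *log_fp = NULL;

int jobacct_p_init_slurmctld(char *job_acct_log)
{
    if (job_acct_log == NULL || log_fp != NULL)
        return SLURM_FAILURE;
    log_fp = fopen(job_acct_log, "a");
    return log_fp != NULL ? SLURM_SUCCESS : SLURM_FAILURE;
}

int jobacct_p_fini_slurmctld(void)
{
    int rc;

    if (log_fp == NULL)
        return SLURM_FAILURE;
    rc = fclose(log_fp);
    log_fp = NULL;
    return rc == 0 ? SLURM_SUCCESS : SLURM_FAILURE;
}

/* Hypothetical helper: appends one "<job id> <event>" line and flushes. */
static int log_event(const char *event, struct job_record *job_ptr)
{
    if (log_fp == NULL || job_ptr == NULL)
        return SLURM_FAILURE;
    if (fprintf(log_fp, "%u %s\n", (unsigned)job_ptr->job_id, event) < 0)
        return SLURM_FAILURE;
    return fflush(log_fp) == 0 ? SLURM_SUCCESS : SLURM_FAILURE;
}

int jobacct_p_job_start_slurmctld(struct job_record *job_ptr)
{
    return log_event("JOB_START", job_ptr);
}

int jobacct_p_job_complete_slurmctld(struct job_record *job_ptr)
{
    return log_event("JOB_COMPLETE", job_ptr);
}
```

The step and suspend functions could reuse the same helper with their own event names.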
int jobacct_p_init_struct(jobacctinfo_t *jobacct, uint16_t tid)
Description:
jobacct_p_init_struct() is called to set the values of a jobacctinfo_t to
initial values.
Arguments:
jobacct
(input/output) structure to be altered.
tid
(input) id of the task. Send in (uint16_t)NO_VAL if there is no specific task.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
jobacctinfo_t *jobacct_p_alloc(uint16_t tid)
Description:
jobacct_p_alloc() is used to allocate and initialize a
new jobacctinfo structure. Release the returned structure with jobacct_p_free().
Arguments:
tid
(input) id of the task. Send in (uint16_t)NO_VAL if there is no specific task.
Returns:
jobacctinfo structure pointer on success, or
NULL on failure.
void jobacct_p_free(jobacctinfo_t *jobacct)
Description:
jobacct_p_free() is used to free the allocation made by jobacct_p_alloc().
Arguments:
jobacct
(input) structure to be freed.
Returns:
none
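The relationship between jobacct_p_init_struct(), jobacct_p_alloc(), and jobacct_p_free() can be sketched as below. The jobacctinfo_t fields are hypothetical, and NO_VAL and the status macros are local stand-ins for the definitions in the SLURM headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLURM_SUCCESS 0      /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1)   /* stand-in for the SLURM header value */
#define NO_VAL (0xfffffffe)  /* stand-in for the SLURM header value */

/* Hypothetical stand-in for SLURM's jobacctinfo_t. */
typedef struct jobacctinfo {
    uint16_t tid;
    uint32_t max_rss;
    uint32_t max_vsize;
} jobacctinfo_t;

/* Resets every field, then records the task id. */
int jobacct_p_init_struct(jobacctinfo_t *jobacct, uint16_t tid)
{
    if (jobacct == NULL)
        return SLURM_FAILURE;
    memset(jobacct, 0, sizeof(*jobacct));
    jobacct->tid = tid;
    return SLURM_SUCCESS;
}

/* Allocates a structure and initializes it; NULL on failure. */
jobacctinfo_t *jobacct_p_alloc(uint16_t tid)
{
    jobacctinfo_t *jobacct = malloc(sizeof(*jobacct));

    if (jobacct != NULL)
        jobacct_p_init_struct(jobacct, tid);
    return jobacct;
}

/* Releases a structure obtained from jobacct_p_alloc(). */
void jobacct_p_free(jobacctinfo_t *jobacct)
{
    free(jobacct);
}
```

Having jobacct_p_alloc() call jobacct_p_init_struct() keeps the initial values defined in one place.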
int jobacct_p_setinfo(jobacctinfo_t *jobacct,
enum jobacct_data_type type, void *data)
Description:
jobacct_p_setinfo() is called to set the values of a jobacctinfo_t to
specific values based on inputs.
Arguments:
jobacct
(input/output) structure to be altered.
type
(input) enum of specific part of jobacct to alter.
data
(input) corresponding data to set jobacct part to.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
int jobacct_p_getinfo(jobacctinfo_t *jobacct,
enum jobacct_data_type type, void *data)
Description:
jobacct_p_getinfo() is called to get specific values from a jobacctinfo_t
based on inputs.
Arguments:
jobacct
(input) structure to be queried.
type
(input) enum of specific part of jobacct to get.
data
(output) corresponding data read from the jobacct part.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
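jobacct_p_setinfo() and jobacct_p_getinfo() both dispatch on the type argument and treat data as a pointer to the matching value. The sketch below shows that dispatch. The enum members SKETCH_DATA_MAX_RSS and SKETCH_DATA_MAX_VSIZE, the jobacctinfo_t fields, and the choice of uint32_t for data are all hypothetical; the real enum jobacct_data_type defines its own members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLURM_SUCCESS 0    /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1) /* stand-in for the SLURM header value */

/* Hypothetical members; the real enum is defined by SLURM. */
enum jobacct_data_type {
    SKETCH_DATA_MAX_RSS,
    SKETCH_DATA_MAX_VSIZE
};

/* Hypothetical stand-in for SLURM's jobacctinfo_t. */
typedef struct jobacctinfo {
    uint32_t max_rss;
    uint32_t max_vsize;
} jobacctinfo_t;

/* Copies *data into the part of jobacct selected by type. */
int jobacct_p_setinfo(jobacctinfo_t *jobacct,
                      enum jobacct_data_type type, void *data)
{
    uint32_t *value = data;

    if (jobacct == NULL || data == NULL)
        return SLURM_FAILURE;
    switch (type) {
    case SKETCH_DATA_MAX_RSS:
        jobacct->max_rss = *value;
        break;
    case SKETCH_DATA_MAX_VSIZE:
        jobacct->max_vsize = *value;
        break;
    default:
        return SLURM_FAILURE;
    }
    return SLURM_SUCCESS;
}

/* Copies the part of jobacct selected by type into *data. */
int jobacct_p_getinfo(jobacctinfo_t *jobacct,
                      enum jobacct_data_type type, void *data)
{
    uint32_t *value = data;

    if (jobacct == NULL || data == NULL)
        return SLURM_FAILURE;
    switch (type) {
    case SKETCH_DATA_MAX_RSS:
        *value = jobacct->max_rss;
        break;
    case SKETCH_DATA_MAX_VSIZE:
        *value = jobacct->max_vsize;
        break;
    default:
        return SLURM_FAILURE;
    }
    return SLURM_SUCCESS;
}
```

Because data is untyped, caller and plugin must agree on the value type for each enum member.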
void jobacct_p_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)
Description:
jobacct_p_aggregate() is called to aggregate and get max values from two
different jobacctinfo structures.
Arguments:
dest
(input/output) initial structure to be applied to.
from
(input) new info to apply to dest.
Returns:
none
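As a rough illustration of aggregating and taking max values, the sketch below merges one structure into another. The field names (max_rss, max_rss_task, tot_rss) are hypothetical stand-ins for whatever counters a plugin actually tracks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SLURM's jobacctinfo_t. */
typedef struct jobacctinfo {
    uint32_t max_rss;      /* largest value seen so far */
    uint16_t max_rss_task; /* task id that reached max_rss */
    uint32_t tot_rss;      /* running total across tasks */
} jobacctinfo_t;

/* Folds "from" into "dest": maxima are compared, totals are summed. */
void jobacct_p_aggregate(jobacctinfo_t *dest, jobacctinfo_t *from)
{
    if (dest == NULL || from == NULL)
        return;
    if (from->max_rss > dest->max_rss) {
        dest->max_rss = from->max_rss;
        dest->max_rss_task = from->max_rss_task;
    }
    dest->tot_rss += from->tot_rss;
}
```

Recording which task produced the maximum alongside the maximum itself lets the aggregate still name that task later.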
void jobacct_p_2_sacct(sacct_t *sacct, jobacctinfo_t *jobacct)
Description:
jobacct_p_2_sacct() is called to transfer information from data structure
jobacct to structure sacct.
Arguments:
sacct
(input/output) initial structure to be applied to.
jobacct
(input) jobacctinfo_t structure containing information to apply to sacct.
Returns:
none
void jobacct_p_pack(jobacctinfo_t *jobacct, Buf buffer)
Description:
jobacct_p_pack() packs a jobacctinfo_t into a buffer to send across the network.
Arguments:
jobacct
(input) structure to pack.
buffer
(input/output) buffer to pack structure into.
Returns:
none
int jobacct_p_unpack(jobacctinfo_t *jobacct, Buf buffer)
Description:
jobacct_p_unpack() unpacks a jobacctinfo_t from a buffer received from
the network.
You will need to free the jobacctinfo_t returned by this function!
Arguments:
jobacct
(input/output) structure to fill.
buffer
(input) buffer to unpack structure from.
Returns:
SLURM_SUCCESS on success, or
SLURM_FAILURE on failure.
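Packing and unpacking must agree on field order and byte order. The sketch below shows a round trip under several assumptions: Buf is replaced by a hypothetical fixed-size byte buffer (the real Buf type comes from the SLURM pack library), put32() and get32() are hypothetical helpers, the jobacctinfo_t fields are invented, and jobacct_p_unpack() returns the status listed under Returns.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLURM_SUCCESS 0    /* stand-in for the SLURM header value */
#define SLURM_FAILURE (-1) /* stand-in for the SLURM header value */

/* Hypothetical stand-in for SLURM's Buf type. */
typedef struct sketch_buf {
    unsigned char data[64];
    size_t size; /* bytes written so far */
    size_t pos;  /* next byte to read */
} *Buf;

/* Hypothetical stand-in for SLURM's jobacctinfo_t. */
typedef struct jobacctinfo {
    uint32_t max_rss;
    uint32_t max_vsize;
} jobacctinfo_t;

/* Appends one 32-bit value in network byte order, if it fits. */
static void put32(Buf b, uint32_t v)
{
    uint32_t n = htonl(v);

    if (b->size + sizeof(n) > sizeof(b->data))
        return;
    memcpy(b->data + b->size, &n, sizeof(n));
    b->size += sizeof(n);
}

/* Reads one 32-bit value; fails when the buffer is exhausted. */
static int get32(Buf b, uint32_t *v)
{
    uint32_t n;

    if (b->pos + sizeof(n) > b->size)
        return SLURM_FAILURE;
    memcpy(&n, b->data + b->pos, sizeof(n));
    b->pos += sizeof(n);
    *v = ntohl(n);
    return SLURM_SUCCESS;
}

void jobacct_p_pack(jobacctinfo_t *jobacct, Buf buffer)
{
    put32(buffer, jobacct->max_rss);
    put32(buffer, jobacct->max_vsize);
}

/* Fields are read back in the same order in which they were packed. */
int jobacct_p_unpack(jobacctinfo_t *jobacct, Buf buffer)
{
    if (get32(buffer, &jobacct->max_rss) != SLURM_SUCCESS)
        return SLURM_FAILURE;
    return get32(buffer, &jobacct->max_vsize);
}
```

Converting with htonl() and ntohl() keeps the packed form independent of the byte order of the sending and receiving hosts.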
Rather than proliferate slurm.conf parameters for new or evolved
plugins, the job accounting API counts on three parameters:
This document describes version 1 of the SLURM Job Accounting API. Future
releases of SLURM may revise this API. A job accounting plugin conveys its
ability to implement a particular API version using the mechanism outlined
for SLURM plugins.
Last modified 31 January 2007
The sacct program can be used to display gathered data from regular
accounting and from these plugins.
API Functions
The job accounting API uses hooks in the slurmctld, slurmd, and slurmstepd. The functions described above fall into three groups: functions called by all slurmstepd processes, functions called by the slurmctld process, and functions common to all processes. The caller must free any jobacctinfo_t that one of these functions returns.