SLURM Accounting Storage Plugin API
Overview
This document describes SLURM Accounting Storage plugins and the API that defines them. It is intended as a resource to programmers wishing to write their own SLURM Job Accounting Storage plugins. This is version 1 of the API.
SLURM Accounting Storage plugins must conform to the SLURM Plugin API with the following specifications:
const char
plugin_name[]="full text name"
A free-formatted ASCII text string that identifies the plugin.
const char
plugin_type[]="major/minor"
The major type must be "accounting_storage."
The minor type can be any suitable name
for the type of accounting package. We currently use
The programmer is urged to study
src/plugins/accounting_storage/mysql
for a sample implementation of a SLURM Accounting Storage plugin.
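As a rough illustration of the two identification symbols described above, a plugin source file might start as follows. This is only a sketch: the minor type "example" and the helper has_accounting_storage_major() are hypothetical names invented here, not part of SLURM.

```c
#include <assert.h>
#include <string.h>

/* Identification symbols read by the generic SLURM plugin loader.
 * The minor type "example" is a hypothetical name used for this sketch. */
const char plugin_name[] = "Accounting storage EXAMPLE plugin";
const char plugin_type[] = "accounting_storage/example";

/* Hypothetical helper (not a SLURM function): returns 1 if a plugin_type
 * string carries the major type this API requires, 0 otherwise. */
int has_accounting_storage_major(const char *type)
{
	static const char major[] = "accounting_storage/";

	assert(type != NULL);
	return strncmp(type, major, sizeof(major) - 1) == 0;
}
```

The helper only exists to make the "major/minor" convention concrete; a real plugin needs just the two constant strings.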
The Accounting Storage plugin was written to be an interface
for storing data collected by the Job Accounting Gather plugin. When
adding a new database you may want to put shared functions in a common
file in the src/database dir, so other plugins can also use that
database type to write out information. Refer to
src/database/mysql_common.c|.h for an example.
All of the following functions are required. Functions which are not
implemented must be stubbed.
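A stub can be as small as a function that ignores its arguments and reports success. The sketch below assumes stand-in definitions for SLURM_SUCCESS, SLURM_ERROR, List and acct_qos_cond_t so that it compiles outside the SLURM tree; a real plugin would take these from the SLURM headers. Returning SLURM_SUCCESS (or NULL for list-returning functions) is one reasonable choice for a stub, not a requirement stated by this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles on its own (assumptions, see above). */
#define SLURM_SUCCESS 0
#define SLURM_ERROR  -1
typedef struct stub_list *List;                  /* placeholder for List */
typedef struct stub_qos_cond acct_qos_cond_t;    /* opaque placeholder */

/* Stub: this storage type keeps no QOS records, so there is nothing to do. */
int acct_storage_p_add_qos(void *db_conn, uint32_t uid, List qos_list)
{
	(void) db_conn;
	(void) uid;
	(void) qos_list;
	return SLURM_SUCCESS;
}

/* Stub: no QOS records are stored, so no list can be returned. */
List acct_storage_p_get_qos(void *db_conn, uint32_t uid,
			    acct_qos_cond_t *qos_cond)
{
	(void) db_conn;
	(void) uid;
	(void) qos_cond;
	return NULL;
}
```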
void *acct_storage_p_get_connection(bool make_agent, int conn_num,
bool rollback, char *location)
Description: Arguments: Returns:
int acct_storage_p_close_connection(void **db_conn)
Description: Arguments: Returns:
int acct_storage_p_commit(void *db_conn, bool commit)
Description: Arguments: Returns:
int acct_storage_p_add_users(void *db_conn, uint32_t uid, List user_list)
Description: Arguments: Returns:
int acct_storage_p_add_coord(void *db_conn, uint32_t uid, List acct_list, acct_user_cond_t *user_cond)
Description: Arguments: Returns:
int acct_storage_p_add_accts(void *db_conn, uint32_t uid, List acct_list)
Description: Arguments: Returns:
int acct_storage_p_add_clusters(void *db_conn, uint32_t uid, List cluster_list)
Description: Arguments: Returns:
int acct_storage_p_add_associations(void *db_conn, uint32_t uid, List association_list)
Description: Arguments: Returns:
int acct_storage_p_add_qos(void *db_conn, uint32_t uid, List qos_list)
Description: Arguments: Returns:
int acct_storage_p_add_wckeys(void *db_conn, uint32_t uid, List wckey_list)
Description: Arguments: Returns:
List acct_storage_p_modify_users(void *db_conn, uint32_t uid,
acct_user_cond_t *user_cond, acct_user_rec_t *user)
Description: Arguments: Returns:
List acct_storage_p_modify_accounts(void *db_conn, uint32_t uid,
acct_account_cond_t *acct_cond, acct_account_rec_t *acct)
Description: Arguments: Returns:
List acct_storage_p_modify_clusters(void *db_conn, uint32_t uid,
acct_cluster_cond_t *cluster_cond, acct_cluster_rec_t *cluster)
Description: Arguments: Returns:
List acct_storage_p_modify_associations(void *db_conn, uint32_t uid,
acct_association_cond_t *assoc_cond, acct_association_rec_t *assoc)
Description: Arguments: Returns:
List acct_storage_p_modify_qos(void *db_conn, uint32_t uid,
acct_qos_cond_t *qos_cond, acct_qos_rec_t *qos)
Description: Arguments: Returns:
List acct_storage_p_modify_wckeys(void *db_conn, uint32_t uid,
acct_wckey_cond_t *wckey_cond, acct_wckey_rec_t *wckey)
Description: Arguments: Returns:
List acct_storage_p_remove_users(void *db_conn, uint32_t uid,
acct_user_cond_t *user_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_coord(void *db_conn, uint32_t uid,
List acct_list, acct_user_cond_t *user_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_accounts(void *db_conn, uint32_t uid,
acct_account_cond_t *acct_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_clusters(void *db_conn, uint32_t uid,
acct_cluster_cond_t *cluster_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_associations(void *db_conn, uint32_t uid,
acct_association_cond_t *assoc_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_qos(void *db_conn, uint32_t uid,
acct_qos_cond_t *qos_cond)
Description: Arguments: Returns:
List acct_storage_p_remove_wckeys(void *db_conn, uint32_t uid,
acct_wckey_cond_t *wckey_cond)
Description: Arguments: Returns:
List acct_storage_p_get_users(void *db_conn, uint32_t uid,
acct_user_cond_t *user_cond)
Description: Arguments: Returns:
List acct_storage_p_get_accts(void *db_conn, uint32_t uid,
acct_account_cond_t *acct_cond)
Description: Arguments: Returns:
List acct_storage_p_get_clusters(void *db_conn, uint32_t uid,
acct_cluster_cond_t *cluster_cond)
Description: Arguments: Returns:
List acct_storage_p_get_associations(void *db_conn, uint32_t uid,
acct_association_cond_t *assoc_cond)
Description: Arguments: Returns:
List acct_storage_p_get_qos(void *db_conn, uint32_t uid,
acct_qos_cond_t *qos_cond)
Description: Arguments: Returns:
List acct_storage_p_get_wckeys(void *db_conn, uint32_t uid,
acct_wckey_cond_t *wckey_cond)
Description: Arguments: Returns:
List acct_storage_p_get_txn(void *db_conn, uint32_t uid,
acct_txn_cond_t *txn_cond)
Description: Arguments: Returns:
int acct_storage_p_get_usage(void *db_conn, uint32_t uid, void *in, int type,
time_t start, time_t end)
Description: Arguments: Returns:
int acct_storage_p_roll_usage(void *db_conn, time_t sent_start)
Description: Arguments: Returns:
int clusteracct_storage_p_node_down(void *db_conn, char *cluster,
struct node_record *node_ptr, time_t event_time, char *reason)
Description: Arguments: Returns:
int clusteracct_storage_p_node_up(void *db_conn, char *cluster,
struct node_record *node_ptr, time_t event_time)
Description: Arguments: Returns:
int clusteracct_storage_p_cluster_procs(void *db_conn, char *cluster,
uint32_t procs, time_t event_time)
Description: Arguments: Returns:
int clusteracct_storage_p_get_usage(void *db_conn, uint32_t uid, void
*cluster_rec, int type, time_t start, time_t end)
Description: Arguments: Returns:
int clusteracct_storage_p_register_ctld(void *db_conn, char *cluster,
uint16_t port)
Description: Arguments: Returns:
int jobacct_storage_p_job_start(void *db_conn, struct job_record *job_ptr)
Description: Arguments: Returns:
int jobacct_storage_p_job_complete(void *db_conn, struct job_record *job_ptr)
Description: Arguments: Returns:
int jobacct_storage_p_step_start(void *db_conn, struct step_record *step_ptr)
Description: Arguments: Returns:
int jobacct_storage_p_step_complete(void *db_conn, struct step_record *step_ptr)
Description: Arguments: Returns:
int jobacct_storage_p_job_suspend(void *db_conn, struct job_record *job_ptr)
Description: Arguments: Returns:
List jobacct_storage_p_get_jobs_cond(void *db_conn, uint32_t uid,
acct_job_cond_t *job_cond)
Description: Arguments: Returns:
int jobacct_storage_p_archive(void *db_conn, acct_archive_cond_t *arch_cond)
Description: Arguments: Returns:
int jobacct_storage_p_archive_load(void *db_conn, acct_archive_rec_t *arch_rec)
Description: Arguments: Returns:
int acct_storage_p_update_shares_used(void *db_conn, List acct_list)
Description: Arguments: Returns:
int acct_storage_p_flush_jobs_on_cluster(void *db_conn, char *cluster, time_t event_time)
Description: Arguments: Returns:
Parameters
These parameters can be used in slurm.conf to set up
connections to the database. All have defaults based on the plugin type
used.
Versioning
This document describes version 1 of the SLURM Accounting Storage API. Future
releases of SLURM may revise this API. An Accounting Storage plugin conveys its
ability to implement a particular API version using the mechanism outlined
for SLURM plugins.
Last modified 10 February 2009
API Functions
The Job Accounting Storage API uses hooks in the slurmctld.
Functions called by the accounting_storage plugin
acct_storage_p_get_connection() is called to get a connection to the
storage medium. acct_storage_p_close_connection() should be used to
free the pointer returned by this function.
make_agent (input) whether to make an agent
thread or not. This is primarily used in the slurmdbd plugin.
conn_num (input) connection number to
the plugin. In many cases you should plan on multiple simultaneous
connections to the plugin. This number is useful because debug
messages can print it to show which connection a message
came from.
rollback (input) whether to allow rollback
or not (for use with databases that support rollback).
void * which is an opaque structure
used inside the plugin to connect to the storage type on success, or
NULL on failure.
acct_storage_p_close_connection() is called at the end of the program that
called acct_storage_p_get_connection(). This function closes the connection to
the storage type.
db_conn (input/output) connection to
the storage type. All memory will be freed inside this function and
the pointer set to NULL.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
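The open/close pairing described above can be sketched as follows. The structure example_conn_t and its fields are hypothetical, the SLURM_SUCCESS and SLURM_ERROR values are stand-ins, and the sketch ignores make_agent and location; it only shows the opaque pointer being handed out, tagged with conn_num for debug output, and then freed and set to NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define SLURM_SUCCESS 0   /* stand-in for the SLURM definition */
#define SLURM_ERROR  -1   /* stand-in for the SLURM definition */

/* Hypothetical connection state; callers only ever see it as void *. */
typedef struct {
	int  conn_num;   /* echoed in debug messages */
	bool rollback;   /* true if uncommitted changes may be rolled back */
} example_conn_t;

void *acct_storage_p_get_connection(bool make_agent, int conn_num,
				    bool rollback, char *location)
{
	example_conn_t *conn = calloc(1, sizeof(*conn));

	(void) make_agent;   /* no agent thread in this sketch */
	(void) location;     /* unused in this sketch */
	if (!conn)
		return NULL;
	conn->conn_num = conn_num;
	conn->rollback = rollback;
	fprintf(stderr, "conn %d: opened\n", conn->conn_num);
	return conn;
}

int acct_storage_p_close_connection(void **db_conn)
{
	example_conn_t *conn;

	if (!db_conn || !*db_conn)
		return SLURM_ERROR;
	conn = *db_conn;
	fprintf(stderr, "conn %d: closed\n", conn->conn_num);
	free(conn);
	*db_conn = NULL;   /* the caller's pointer must not dangle */
	return SLURM_SUCCESS;
}
```

Taking void ** in the close function is what lets the plugin clear the caller's copy of the pointer, so a second close on the same variable fails cleanly instead of freeing twice.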
acct_storage_p_commit() is called at a point where you want
changes to storage to be either committed or rolled back. This function
should also send appropriate update messages to the various slurmctlds.
db_conn (input) connection to
the storage type.
commit (input) true to commit, false
to roll back if the connection was set up to allow rollback.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
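The commit/rollback decision might look like the sketch below. It assumes a hypothetical connection structure that keeps a committed value and a pending value; a real plugin would issue the corresponding database commands instead, and would also send update messages to the slurmctlds, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLURM_SUCCESS 0   /* stand-in for the SLURM definition */
#define SLURM_ERROR  -1   /* stand-in for the SLURM definition */

/* Hypothetical connection state for this sketch only. */
typedef struct {
	bool rollback;    /* set when the connection was opened */
	int  committed;   /* value visible to everyone */
	int  pending;     /* working value, not yet committed */
} example_conn_t;

int acct_storage_p_commit(void *db_conn, bool commit)
{
	example_conn_t *conn = db_conn;

	if (!conn)
		return SLURM_ERROR;
	if (commit)
		conn->committed = conn->pending;   /* make changes permanent */
	else if (conn->rollback)
		conn->pending = conn->committed;   /* discard pending changes */
	/* commit == false without rollback support: nothing to undo. */
	return SLURM_SUCCESS;
}
```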
acct_storage_p_add_users() is called to add users to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
user_list (input) list of
acct_user_rec_t *'s containing information about the users being added.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_coord() is called to link specified users to the specified accounts as coordinators.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
acct_list (input) list of
acct_account_rec_t *'s containing information about the accounts to
add the coordinators to.
user_cond (input) contains a list of
users to add as coordinators of the accounts in acct_list.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_accts() is called to add accounts to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
acct_list (input) list of
acct_account_rec_t *'s containing information about the accounts to add.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_clusters() is called to add clusters to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
cluster_list (input) list of
acct_cluster_rec_t *'s containing information about the clusters to add.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_associations() is called to add associations to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
association_list (input) list of
acct_association_rec_t *'s containing information about the
associations to add.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_qos() is called to add QOS to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
qos_list (input) list of
acct_qos_rec_t *'s containing information about the qos to add.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_add_wckeys() is called to add wckeys to the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
wckey_list (input) list of
acct_wckey_rec_t *'s containing information about the wckeys to add.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_modify_users() is used to modify existing users in the
storage type. The condition could include very vague information about
the users, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified users is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
user_cond (input) conditional about
which users need to change. User names or ids should not need to be stated.
user (input) what the changes
should be on the users identified by the conditional.
List containing names of users
modified on success, or
NULL on failure.
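The pattern shared by the modify functions (match a loose condition, apply the change, report back exactly what was touched) can be sketched as below. Everything here is hypothetical: example_user_t and example_user_cond_t are heavily reduced stand-ins for acct_user_rec_t and acct_user_cond_t, and an output array replaces the returned List so the sketch stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_USERS 8

/* Reduced stand-ins; the real structures carry many more fields. */
typedef struct {
	const char *name;
	const char *default_acct;
} example_user_t;

typedef struct {
	const char *default_acct;   /* NULL matches every user */
} example_user_cond_t;

/* Set new_default_acct on every user matching cond. The names of the users
 * changed are written to modified[] and their count is returned. */
size_t example_modify_users(example_user_t *users, size_t n_users,
			    const example_user_cond_t *cond,
			    const char *new_default_acct,
			    const char *modified[MAX_USERS])
{
	size_t i, n_mod = 0;

	assert(users && cond && new_default_acct && modified);
	for (i = 0; i < n_users && n_mod < MAX_USERS; i++) {
		if (cond->default_acct &&
		    strcmp(users[i].default_acct, cond->default_acct) != 0)
			continue;   /* does not match the condition */
		users[i].default_acct = new_default_acct;
		modified[n_mod++] = users[i].name;
	}
	return n_mod;
}
```

Because the condition never names users directly, the returned names are the only way for the caller to see how far the change reached.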
acct_storage_p_modify_accounts() is used to modify existing accounts in the
storage type. The condition could include very vague information about
the accounts, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified accounts is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
acct_cond (input) conditional about
which accounts need to change. Account names should not need to be stated.
acct (input) what the changes
should be on the accounts identified by the conditional.
List containing names of accounts
modified on success, or
NULL on failure.
acct_storage_p_modify_clusters() is used to modify existing clusters in the
storage type. The condition could include very vague information about
the clusters, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified clusters is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
cluster_cond (input) conditional about
which clusters need to change. Cluster names should not need to be stated.
cluster (input) what the changes
should be on the clusters identified by the conditional.
List containing names of clusters
modified on success, or
NULL on failure.
acct_storage_p_modify_associations() is used to modify existing associations
in the storage type. The condition could include very vague information about
the associations, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified associations is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
assoc_cond (input) conditional about
which associations need to change. Association ids should not need to be stated.
assoc (input) what the changes
should be on the associations identified by the conditional.
List containing names of associations
modified on success, or
NULL on failure.
acct_storage_p_modify_qos() is used to modify existing qos in the
storage type. The condition could include very vague information about
the qos, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified qos is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
qos_cond (input) conditional about
which qos need to change. Qos names should not need to be stated.
qos (input) what the changes
should be on the qos identified by the conditional.
List containing names of qos
modified on success, or
NULL on failure.
acct_storage_p_modify_wckeys() is used to modify existing wckeys in the
storage type. The condition could include very vague information about
the wckeys, so this function should be robust enough to apply everything
the caller is asking for. For this reason a list of modified wckeys is
returned, so the caller knows what has been changed, sometimes by mistake.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
wckey_cond (input) conditional about
which wckeys need to change. Wckey names should not need to be stated.
wckey (input) what the changes
should be on the wckey identified by the conditional.
List containing names of wckeys
modified on success, or
NULL on failure.
acct_storage_p_remove_users() is used to remove users from the storage type.
This will remove all of their associations. The plugin must check that all
running jobs are finished before this is allowed to execute.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
user_cond (input) conditional about
which users are to be removed. User names or ids should not need to be stated.
List containing names of users
removed on success, or
NULL on failure.
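The running-jobs check described above could be organised as a two-pass operation: first refuse the whole request if any matching user is still busy, then remove. The sketch below is an assumption about how one might do this; example_user_t, the name-only filter, and the count/-1 return convention are hypothetical stand-ins for the real records, condition, and returned List.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for a stored user record. */
typedef struct {
	const char *name;
	int running_jobs;   /* jobs still running for this user */
	int removed;        /* 1 once the user has been removed */
} example_user_t;

/* Remove every user whose name equals name_filter (NULL matches all users).
 * Returns the number removed, or -1 without changing anything if any
 * matching user still has running jobs. */
int example_remove_users(example_user_t *users, size_t n_users,
			 const char *name_filter)
{
	size_t i;
	int n_removed = 0;

	assert(users != NULL);
	/* Pass 1: refuse the whole request if any match is still busy. */
	for (i = 0; i < n_users; i++) {
		if (name_filter && strcmp(users[i].name, name_filter) != 0)
			continue;
		if (!users[i].removed && users[i].running_jobs > 0)
			return -1;
	}
	/* Pass 2: nothing is running, so the removal is safe. */
	for (i = 0; i < n_users; i++) {
		if (name_filter && strcmp(users[i].name, name_filter) != 0)
			continue;
		if (!users[i].removed) {
			users[i].removed = 1;
			n_removed++;
		}
	}
	return n_removed;
}
```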
acct_storage_p_remove_coord() is used to remove coordinators from the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
acct_list (input) list of accounts
associated with the users.
user_cond (input) conditional about
which users are to be removed as coordinators. User names or ids should be stated.
List containing names of users
removed as coordinators on success, or
NULL on failure.
acct_storage_p_remove_accounts() is used to remove accounts from the storage
type. This will remove all associations from these accounts. You need to make
sure no jobs are running with any association that is to be removed. If any of
these accounts are default accounts for users, those defaults must also be
changed before the account can be removed.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
acct_cond (input) conditional about
which accounts are to be removed. Account names should not need to be stated.
List containing names of accounts
removed on success, or
NULL on failure.
acct_storage_p_remove_clusters() is used to remove clusters from the storage
type. This will remove all associations from these clusters. You need to make
sure no jobs are running with any association that is to be removed.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
cluster_cond (input) conditional about
which clusters are to be removed. Cluster names should not need to be stated.
List containing names of clusters
removed on success, or
NULL on failure.
acct_storage_p_remove_associations() is used to remove associations from the
storage type. You need to make sure no jobs are running with any association
that is to be removed.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
assoc_cond (input) conditional about
which associations are to be removed. Association ids should not need to be stated.
List containing names of associations
removed on success, or
NULL on failure.
acct_storage_p_remove_qos() is used to remove qos from the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
qos_cond (input) conditional about
which qos are to be removed. Qos names should not need to be stated.
List containing names of qos
removed on success, or
NULL on failure.
acct_storage_p_remove_wckeys() is used to remove wckeys from the storage type.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
wckey_cond (input) conditional about
which wckeys are to be removed. Wckey names should not need to be stated.
List containing names of wckeys
removed on success, or
NULL on failure.
acct_storage_p_get_users() gets a list of acct_user_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
user_cond (input) conditional about
which users are to be returned. User names or ids should not need to
be stated.
List containing acct_user_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_accts() gets a list of acct_account_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
acct_cond (input) conditional about
which accounts are to be returned. Account names should not need to
be stated.
List containing acct_account_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_clusters() gets a list of acct_cluster_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
cluster_cond (input) conditional about
which clusters are to be returned. Cluster names should not need to
be stated.
List containing acct_cluster_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_associations() gets a list of acct_association_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
assoc_cond (input) conditional about
which associations are to be returned. Association names should not need to
be stated.
List containing acct_association_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_qos() gets a list of acct_qos_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
qos_cond (input) conditional about
which qos are to be returned. Qos names should not need to
be stated.
List containing acct_qos_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_wckeys() gets a list of acct_wckey_rec_t *'s based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
wckey_cond (input) conditional about
which wckeys are to be returned. Wckey names should not need to
be stated.
List containing acct_wckey_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_txn() gets a list of acct_txn_rec_t *'s (transactions) based on the conditional sent.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
txn_cond (input) conditional about
which transactions are to be returned. Transaction ids should not need to
be stated.
List containing acct_txn_rec_t *'s
on success, or
NULL on failure.
acct_storage_p_get_usage() gets usage for a specific association or wckey.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
in (input/out) can be any structure that
gathers usage, such as acct_association_rec_t * or acct_wckey_rec_t *.
type (input) really a
slurmdbd_msg_type_t; it should let the plugin know which kind of
structure was passed in.
start (input) start time of the usage.
end (input) end time of the usage.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
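Because in is an untyped pointer, the plugin has to branch on type before touching it. The sketch below shows that dispatch only. The enum values, the two record structures, and the function name example_get_usage() are hypothetical stand-ins, and the "usage" written back is simply the length of the requested window, used as a placeholder where a real plugin would query its storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SLURM_SUCCESS 0   /* stand-in for the SLURM definition */
#define SLURM_ERROR  -1   /* stand-in for the SLURM definition */

/* Hypothetical message types standing in for slurmdbd_msg_type_t values. */
typedef enum {
	EX_GET_ASSOC_USAGE = 1,
	EX_GET_WCKEY_USAGE = 2
} example_msg_type_t;

/* Reduced stand-ins for the association and wckey records. */
typedef struct { uint64_t usage_secs; } example_assoc_rec_t;
typedef struct { uint64_t usage_secs; } example_wckey_rec_t;

int example_get_usage(void *in, int type, time_t start, time_t end)
{
	uint64_t placeholder;

	if (!in || end < start)
		return SLURM_ERROR;
	/* Placeholder value: a real plugin would look this up in storage. */
	placeholder = (uint64_t) (end - start);

	switch (type) {
	case EX_GET_ASSOC_USAGE:
		((example_assoc_rec_t *) in)->usage_secs = placeholder;
		return SLURM_SUCCESS;
	case EX_GET_WCKEY_USAGE:
		((example_wckey_rec_t *) in)->usage_secs = placeholder;
		return SLURM_SUCCESS;
	default:
		return SLURM_ERROR;   /* unknown structure, do not touch it */
	}
}
```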
acct_storage_p_roll_usage() rolls up association, cluster, and wckey usage in the storage.
db_conn (input) connection to
the storage type.
sent_start (input) start time of the rollup.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
clusteracct_storage_p_node_down() marks nodes down in the storage type.
db_conn (input) connection to
the storage type.
cluster (input) name of cluster node
is on.
node_ptr (input) pointer to the node
structure marked down.
event_time (input) time event happened.
reason (input) if different from what
is set in the node_ptr, the reason the node is down.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
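The rule for the reason argument (use it when it differs from what the node structure already holds) might be expressed as a small helper. This is a sketch: example_node_t is a reduced stand-in for struct node_record, example_down_reason() is a hypothetical helper, and the "unknown" fallback is an assumption added here for the case where neither source supplies a reason.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for struct node_record. */
typedef struct {
	const char *name;
	const char *reason;   /* reason already recorded on the node, or NULL */
} example_node_t;

/* Pick the reason string to store for a node-down event. */
const char *example_down_reason(const example_node_t *node_ptr,
				const char *reason)
{
	assert(node_ptr != NULL);
	/* An explicit, non-empty reason wins when it differs from the node's. */
	if (reason && reason[0] != '\0' &&
	    (!node_ptr->reason || strcmp(reason, node_ptr->reason) != 0))
		return reason;
	/* Otherwise fall back to the node's own reason (assumed default). */
	return node_ptr->reason ? node_ptr->reason : "unknown";
}
```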
clusteracct_storage_p_node_up() marks nodes up in the storage type.
db_conn (input) connection to
the storage type.
cluster (input) name of cluster node
is on.
node_ptr (input) pointer to the node
structure marked up.
event_time (input) time event happened.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
clusteracct_storage_p_cluster_procs() updates the storage type with the current number of processors on a given cluster.
db_conn (input) connection to
the storage type.
cluster (input) name of cluster.
procs (input) number of processors on
system.
event_time (input) time event happened.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
clusteracct_storage_p_get_usage() gets usage for a specific cluster.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the
function.
cluster_rec (input/out)
acct_cluster_rec_t * already set with the cluster name. Usage will be
filled in.
type (input) really a
slurmdbd_msg_type_t; it should let the plugin know which kind of
structure was passed in. For this function it is just DBD_GET_CLUSTER_USAGE.
start (input) start time of the usage.
end (input) end time of the usage.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
clusteracct_storage_p_register_ctld() is used when a controller is turned on
to tell the storage type where the slurmctld for a given cluster is located.
db_conn (input) connection to
the storage type.
cluster (input) name of cluster.
port (input) port on the host that the cluster is
running on. The host is grabbed from the connection.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacct_storage_p_job_start() is called in the jobacct plugin when a
job starts, inserting information into the database about the new job.
db_conn (input) connection to
the storage type.
job_ptr (input) information about the job in
slurmctld.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacct_storage_p_job_complete() is called in the jobacct plugin when
a job completes. It updates the information about the end of the job.
db_conn (input) connection to
the storage type.
job_ptr (input) information about the job in
slurmctld.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
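The start/complete pairing amounts to "insert a record when the job starts, update the same record when it ends". The sketch below models that with a tiny in-memory table. All types and function names (example_job_t, example_store_t, example_job_start(), example_job_complete()) are hypothetical stand-ins for struct job_record, the database connection, and the real plugin entry points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define SLURM_SUCCESS 0   /* stand-in for the SLURM definition */
#define SLURM_ERROR  -1   /* stand-in for the SLURM definition */
#define MAX_ROWS 4

/* Reduced stand-in for struct job_record. */
typedef struct {
	uint32_t job_id;
	time_t   start_time;
	time_t   end_time;
} example_job_t;

/* One stored row per job; end stays 0 while the job is running. */
typedef struct {
	int      in_use;
	uint32_t job_id;
	time_t   start;
	time_t   end;
} example_row_t;

typedef struct {
	example_row_t rows[MAX_ROWS];
} example_store_t;

/* Insert a row for a newly started job. */
int example_job_start(example_store_t *db, const example_job_t *job)
{
	size_t i;

	if (!db || !job)
		return SLURM_ERROR;
	for (i = 0; i < MAX_ROWS; i++) {
		if (db->rows[i].in_use)
			continue;
		db->rows[i].in_use = 1;
		db->rows[i].job_id = job->job_id;
		db->rows[i].start = job->start_time;
		db->rows[i].end = 0;
		return SLURM_SUCCESS;
	}
	return SLURM_ERROR;   /* table is full */
}

/* Update the existing row with the job's end time. */
int example_job_complete(example_store_t *db, const example_job_t *job)
{
	size_t i;

	if (!db || !job)
		return SLURM_ERROR;
	for (i = 0; i < MAX_ROWS; i++) {
		if (db->rows[i].in_use && db->rows[i].job_id == job->job_id) {
			db->rows[i].end = job->end_time;
			return SLURM_SUCCESS;
		}
	}
	return SLURM_ERROR;   /* no start record for this job */
}
```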
jobacct_storage_p_step_start() is called in the jobacct plugin at the
allocation of a new step in the slurmctld. It inserts information about the
beginning of the step.
db_conn (input) connection to
the storage type.
step_ptr (input) information about the step in
slurmctld.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacct_storage_p_step_complete() is called in the jobacct plugin at
the end of a step in the slurmctld. It updates the ending
information about the step.
db_conn (input) connection to
the storage type.
step_ptr (input) information about the step in
slurmctld.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacct_storage_p_suspend() is called in the jobacct plugin when a
job is suspended or resumed in the slurmctld. It updates the
database with the suspended time of the job.
db_conn (input) connection to
the storage type.
job_ptr (input) information about the job in
slurmctld.
none
jobacct_storage_p_get_jobs_cond() is called to get a list of jobs from the
database given the conditional.
db_conn (input) connection to
the storage type.
uid (input) uid of user calling the function.
job_cond (input) conditional about
which jobs to get. Job ids should not need to be stated.
List of job_rec_t's on success, or
NULL on failure.
jobacct_storage_p_archive() is used to archive old data.
db_conn (input) connection to
the storage type.
arch_cond (input) conditional about
what to archive.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
jobacct_storage_p_archive_load() is used to load old archive data.
db_conn (input) connection to
the storage type.
arch_rec (input) information about
what to load.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_update_shares_used() is used to update shares used in the storage type.
db_conn (input) connection to
the storage type.
acct_list (input) List of shares_used_object_t.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
acct_storage_p_flush_jobs_on_cluster() is used to mark all jobs in the storage type as finished.
db_conn (input) connection to
the storage type.
cluster (input) name of cluster to
apply end to.
event_time (input) when the flush happened.
SLURM_SUCCESS on success, or
SLURM_ERROR on failure.
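A flush can be pictured as stamping an end time on every job of one cluster that is still recorded as running. The sketch below assumes a hypothetical row layout and helper name (example_row_t, example_flush_jobs()), and returns the number of rows changed rather than SLURM_SUCCESS so the effect is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Reduced stand-in for a stored job row. */
typedef struct {
	const char *cluster;
	time_t      end;   /* 0 while the job is recorded as still running */
} example_row_t;

/* Mark every still-running job on the given cluster as ended at event_time.
 * Returns the number of rows that were updated. */
int example_flush_jobs(example_row_t *rows, size_t n_rows,
		       const char *cluster, time_t event_time)
{
	size_t i;
	int n_flushed = 0;

	assert(rows != NULL && cluster != NULL);
	for (i = 0; i < n_rows; i++) {
		if (rows[i].end != 0)
			continue;   /* already finished, leave untouched */
		if (strcmp(rows[i].cluster, cluster) != 0)
			continue;   /* belongs to another cluster */
		rows[i].end = event_time;
		n_flushed++;
	}
	return n_flushed;
}
```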