====== Accounting info of your project ======
===== Accounting script =====
The script **vsc4CoreHours.py** on VSC4 calculates the **elapsed core-hours per user** in your project and the **total number of core-hours of your project**. The basic formula in this script takes into account the number of nodes per job and the elapsed time between the start and the end of the job.
{{ :doku:corehours.png?500 |}}
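As a rough illustration of this formula, the following Python sketch multiplies the number of nodes by the cores per node and the elapsed wall-clock time. It is not taken from ''vsc4CoreHours.py''; the function name ''core_hours'' and the parameter ''cores_per_node'' are assumptions made for this example, and the value 48 is only a placeholder for the core count of your nodes.

```python
from datetime import datetime


def core_hours(start, end, nodes, cores_per_node):
    """Return the core-hours of one job (illustrative, not the script's code)."""
    elapsed_hours = (end - start).total_seconds() / 3600.0
    return nodes * cores_per_node * elapsed_hours


# Hypothetical job: 2 nodes for 2.5 hours, assuming 48 cores per node.
job_start = datetime(2019, 4, 23, 10, 0, 0)
job_end = datetime(2019, 4, 23, 12, 30, 0)
print(core_hours(job_start, job_end, nodes=2, cores_per_node=48))  # 240.0
```

Summing this value over all jobs of a user, or over all jobs of the project, would give the per-user and total figures.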
=== Usage of the script ===
You may specify a start time with ''-S ...'' and an end time with ''-E ...''. The default start time on the clusters is the start of VSC-3, 2015-04-01T00:00:00; the default end time is today. Alternatively, you may specify a duration with ''-D d'', which reports the core-hours used within the past //d// days.
== Examples: ==
vsc4CoreHours.py # total project time span
vsc4CoreHours.py -D 7 # last week
vsc4CoreHours.py -S 2019-04-23 -E 2019-05-26T00:00:01
vsc4CoreHours.py -E 2020-05-26 # project start until 2020-05-26
vsc4CoreHours.py -S 2019-04-23 # 2019-04-23 until today
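The relation between a duration and an explicit start/end pair can be sketched in Python as follows. This is only an assumption about how a "past //d// days" window may be formed; how ''vsc4CoreHours.py'' actually rounds the boundaries is not specified here, and ''window_for_last_days'' is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def window_for_last_days(days, now):
    """Return (start, end) covering the past `days` days up to `now`."""
    return now - timedelta(days=days), now


# Hypothetical "now"; a real call would pass datetime.now().
start, end = window_for_last_days(7, datetime(2019, 5, 26, 12, 0, 0))
print(start.isoformat(), end.isoformat())
```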
===== sacct =====
To customize your accounting query, use the command ''sacct'', which displays information from the SLURM job accounting log or the SLURM database.
The default output values are jobs, job steps, status, and exit codes. By specifying the format, the output of ''sacct'' can be customized.
This section lists only a small subset of the available options.
For the full list see the SLURM documentation on the web ([[http://slurm.schedmd.com/sacct.html|sacct]]) or the manual page (''[username@l34 ~]$ man sacct'').
====Example 1: format specification====
[...@... ~]$ sacct -o Account,User,UID,AveCPUFreq,Elapsed,Start,End,TotalCPU
[...@... ~]$ sacct --format=Account,User,UID,AveCPUFreq,Elapsed,Start,End,TotalCPU
[...@... ~]$ sacct --format=JobID,UID,State,ExitCode
[...@... ~]$ sacct -o UID,User,Account,Group,JobID,JobName,Elapsed,Start,End
The options
-o # or
--format=
specify the format. Available formats are displayed with the options
-e, --helpformat
[...@... ~]$ sacct -e # or
[...@... ~]$ sacct --helpformat
A shortcut for showing all parameters is the option
-l, --long
which is equivalent to specifying ''-o jobid,jobname,partition,maxvmsize,maxvmsizenode,maxvmsizetask,avevmsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,alloccpus,elapsed,state,exitcode,maxdiskread,maxdiskreadnode,maxdiskreadtask,avediskread,maxdiskwrite,maxdiskwritenode,maxdiskwritetask,avediskwrite,allocgres,reqgres''
whereas minimum information (''-o jobid,state,exitcode'') is returned via the option
-b, --brief
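When the output of ''sacct'' is to be processed further, a small Python wrapper may be convenient. The sketch below assembles a command line from a list of format fields and parses pipe-delimited output as produced with ''--parsable2'' and ''--noheader''. The helper names ''build_sacct_command'' and ''parse_sacct_output'' are made up for this example, and the sample output is invented for illustration only.

```python
def build_sacct_command(fields, extra_args=()):
    """Return an argument list for sacct with machine-readable output."""
    return ["sacct", "--parsable2", "--noheader",
            "--format=" + ",".join(fields), *extra_args]


def parse_sacct_output(text, fields):
    """Turn pipe-delimited sacct output into a list of dicts keyed by field."""
    rows = []
    for line in text.splitlines():
        if line.strip():  # skip empty lines
            rows.append(dict(zip(fields, line.split("|"))))
    return rows


fields = ["JobID", "UID", "State", "ExitCode"]
print(build_sacct_command(fields))

# Invented sample output; on a login node it could be obtained with
# subprocess.run(build_sacct_command(fields), capture_output=True, text=True).stdout
sample = "616785|74911|COMPLETED|0:0\n616785.batch||COMPLETED|0:0\n"
for row in parse_sacct_output(sample, fields):
    print(row)
```

Passing the arguments as a list avoids shell quoting issues when the command is run through ''subprocess''.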
====Example 2: start time|end time====
sacct -S YYYY-MM-DD[THH:MM[:SS]]
sacct -E YYYY-MM-DD[THH:MM[:SS]]
sacct -S 2015-05-18T09:00:01 -E 2015-05-18T12:02:01 -X -T
# Valid time formats are...
# HH:MM[:SS] [AM|PM]
# MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
# MM/DD[/YY]-HH:MM[:SS]
# YYYY-MM-DD[THH:MM[:SS]]
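A time string can be checked before it is passed to ''sacct''. The following Python sketch accepts only the last of the formats listed above, ''YYYY-MM-DD[THH:MM[:SS]]''; the function name ''parse_sacct_time'' is an assumption for this example, and the other formats are deliberately not covered.

```python
from datetime import datetime

# Accepted variants of YYYY-MM-DD[THH:MM[:SS]], most specific first.
_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M", "%Y-%m-%d")


def parse_sacct_time(value):
    """Parse a YYYY-MM-DD[THH:MM[:SS]] string or raise ValueError."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("not in YYYY-MM-DD[THH:MM[:SS]] format: " + repr(value))


print(parse_sacct_time("2015-05-18T09:00:01"))
print(parse_sacct_time("2015-05-18"))

try:
    parse_sacct_time("2015-05-1809:00")  # the separator "T" is missing
except ValueError as err:
    print("rejected:", err)
```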
====Example 3: job state====
sacct -s R -S 2015-05-19T17:00 -E 2015-05-19T18:00 -X -T -o JobID,Start,End,State
This example lists jobs that were in the state running (''-s R'') during the given time interval (''-S ...'' start time, ''-E ...'' end time). For ''-X'' and ''-T'' see [[doku:slurm_sacct_-t|below]]; for ''-o ...'' see [[doku:slurm_sacctexample_1format_specification|above]]. The output may look like this:
JobID        Start               End                 State
------------ ------------------- ------------------- ----------
616785       2015-05-19T17:01:33 2015-05-19T17:15:13 CANCELLED+
616835       2015-05-19T17:35:52 2015-05-19T18:00:00 RUNNING
...          ...                 ...                 ...
616175_238   2015-05-19T17:52:00 2015-05-19T17:53:33 COMPLETED
616175_239   2015-05-19T17:52:02 2015-05-19T17:53:38 COMPLETED
616772_1     2015-05-19T17:52:16 2015-05-19T17:52:22 FAILED
616772_2     2015-05-19T17:52:22 2015-05-19T17:52:28 FAILED
The jobs in this list //were running during the selected time interval//. The column //State//, however, reports the //state at the moment the sacct command was executed//.
Further possible parameters for the option ''-s'' are: BF BOOT_FAIL, CA CANCELLED, CD COMPLETED, CF CONFIGURING, CG COMPLETING, F FAILED, NF NODE_FAIL, PD PENDING, PR PREEMPTED, R RUNNING, RS RESIZING, S SUSPENDED, TO TIMEOUT
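The short codes and state names listed above can be kept in a small lookup table, for example to validate a state filter before calling ''sacct''. The Python sketch below assumes that several states are passed to ''-s'' as a comma-separated list; the names ''STATE_CODES'' and ''state_filter'' are hypothetical.

```python
# Short codes and state names as listed above.
STATE_CODES = {
    "BF": "BOOT_FAIL", "CA": "CANCELLED", "CD": "COMPLETED",
    "CF": "CONFIGURING", "CG": "COMPLETING", "F": "FAILED",
    "NF": "NODE_FAIL", "PD": "PENDING", "PR": "PREEMPTED",
    "R": "RUNNING", "RS": "RESIZING", "S": "SUSPENDED", "TO": "TIMEOUT",
}


def state_filter(*codes):
    """Return a value for the -s option, e.g. 'R,PD'; reject unknown codes."""
    unknown = [c for c in codes if c.upper() not in STATE_CODES]
    if unknown:
        raise ValueError("unknown state code(s): " + ", ".join(unknown))
    return ",".join(c.upper() for c in codes)


print(state_filter("r", "pd"))  # R,PD
print(STATE_CODES["TO"])        # TIMEOUT
```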
==== -X ====
The option
[...@... ~]$ sacct -X # or
[...@... ~]$ sacct --allocations
is useful because it shows only cumulative statistics for each job, not the intermediate steps.
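For a listing that was fetched without ''-X'', a similar reduction can be approximated afterwards. The Python sketch below assumes that job steps can be recognized by a ''.step'' suffix in the job ID (for example ''618093.batch''); it only filters IDs and does not reproduce the cumulative statistics that ''sacct -X'' reports. The function names are hypothetical.

```python
def is_job_step(job_id):
    """Job steps carry a '.<step>' suffix, e.g. 618093.batch or 615402.54."""
    return "." in job_id


def allocations_only(job_ids):
    """Keep only the job allocations and drop the individual steps."""
    return [job_id for job_id in job_ids if not is_job_step(job_id)]


ids = ["618093", "618093.batch", "616175_238", "615402.54"]
print(allocations_only(ids))  # ['618093', '616175_238']
```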
==== -T ====
The option
[...@... ~]$ sacct -T # or
[...@... ~]$ sacct --truncate
is supposed to truncate the reported times to the requested interval. If a job started before the start time given with ''-S YYYY-MM-DD[THH:MM[:SS]]'', its reported start time is truncated to that value. The same applies to the end time and ''-E YYYY-MM-DD[THH:MM[:SS]]''.
//We observed unexpected behavior of this option returning start times later than end times.//
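The intended truncation can be expressed as clamping a job's start and end to the requested window. The Python sketch below illustrates that idea only and is not how ''sacct'' is implemented; ''truncate_interval'' and ''is_consistent'' are hypothetical helpers, and the job times are invented. A check such as ''is_consistent'' may help to spot rows where the start time is later than the end time.

```python
from datetime import datetime


def truncate_interval(job_start, job_end, window_start, window_end):
    """Clamp a job's start and end times to the requested window."""
    return max(job_start, window_start), min(job_end, window_end)


def is_consistent(start, end):
    """A reported row is plausible only if start is not later than end."""
    return start <= end


window = (datetime(2015, 5, 19, 17, 0), datetime(2015, 5, 19, 18, 0))

# Invented job that started inside the window and ended after it.
start, end = truncate_interval(datetime(2015, 5, 19, 17, 35, 52),
                               datetime(2015, 5, 19, 18, 40, 0), *window)
print(start, end, is_consistent(start, end))
```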
==== Further options ====
-g gid_list, --gid=gid_list, --group=group_list # e.g., p70815
-j job(.step), --jobs=job(.step) # 618093.batch,615402.54
--name=jobname_list # display jobs that have any of these name(s)
-q qos_list, --qos=qos_list # quality of service (QOS), e.g., normal_0064
-r partition_list, --partition=partition_list # e.g., mem_0064,mem_0256
-u uid_list, --uid=uid_list, --user=user_list # e.g., 74911
==== Job Memory Usage ====
sacct -j jobid --format=JobID,MaxVMSize,MaxVMSizeNode,MaxVMSizeTask # replace jobid with your job ID, e.g., 618093
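Memory columns such as ''MaxVMSize'' are typically printed with a unit suffix. The Python sketch below converts such a value to bytes, assuming suffixes ''K'', ''M'', ''G'', ''T'' that denote multiples of 1024; please verify this assumption against the output on your system. The function name ''size_to_bytes'' is hypothetical.

```python
# Assumed unit suffixes, interpreted as binary multiples.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def size_to_bytes(value):
    """Convert a size string such as '2048K' or '1.5G' to bytes."""
    value = value.strip()
    if not value:
        return 0  # empty field, e.g. for rows without memory data
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(float(value[:-1]) * _UNITS[suffix])
    return int(float(value))  # plain number without suffix


print(size_to_bytes("2048K"))  # 2097152
print(size_to_bytes("1.5G"))   # 1610612736
```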