

Accounting Info of your project

The script vsc3CoreHours.py calculates the core-hours consumed per user in your project and the total core-hours consumed by the project. The basic formula in this script takes into account the number of nodes per job and the time difference between job start and job end.
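The formula described above can be sketched as follows. This is only an illustration, not the actual code of vsc3CoreHours.py: the function name `core_hours` and the parameter `cores_per_node` are hypothetical, and the value of 16 cores per node is an assumption made for the example.

```python
from datetime import datetime

def core_hours(n_nodes, start, end, cores_per_node=16):
    """Core-hours of one job: nodes x cores per node x elapsed hours.

    cores_per_node=16 is an assumed value, used for illustration only.
    """
    fmt = "%Y-%m-%dT%H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return n_nodes * cores_per_node * elapsed.total_seconds() / 3600.0

# A 2-node job that ran for 90 minutes
print(core_hours(2, "2015-05-19T17:00:00", "2015-05-19T18:30:00"))  # 48.0
```

Summing this value over all jobs of a user would give the per-user figure, and summing over all users the project total.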

Usage of the script

You may give a start time with ''-S …'' and an end time with ''-E …''. The default start time is the start of VSC-3, 2015-04-01T00:00:00; the default end time is today. Alternatively, you may give a duration ''-D d'', which gives you the core-hours within the past ''d'' days.

== Examples: ==

<code>
/opt/sw/x86_64/generic/bin/vsc3CoreHours.py -S 2015-04-23 -E 2015-05-26T00:00:01
/opt/sw/x86_64/generic/bin/vsc3CoreHours.py -S 2015-04-23   # default end is today
/opt/sw/x86_64/generic/bin/vsc3CoreHours.py -D 7            # last week
</code>

<code>
sacct -s R -S <project start> -E <today> -X -o JobID,Start,End,NNodes
</code>

===== sacct =====

The command sacct gives access to information from the SLURM job accounting log or SLURM database. The default output values are jobs, job steps, status, and exit codes. By specifying the format, the output of sacct can be customized. This section lists only a minimal subset of options. For the full list see the SLURM documentation on the web (sacct) or the manual pages ([username@l34 ~]$ man sacct).

====Example 1: format specification====

<code>
[…@… ~]$ sacct -o Account,User,UID,AveCPUFreq,Elapsed,Start,End,TotalCPU
[…@… ~]$ sacct --format=Account,User,UID,AveCPUFreq,Elapsed,Start,End,TotalCPU
[…@… ~]$ sacct --format=JobID,UID,State,ExitCode
[…@… ~]$ sacct -o UID,User,Account,Group,JobID,JobName,Elapsed,Start,End
</code>

The options
<code>
-o <comma-separated list of formats, preceded by a space, as in the example above>
# or
--format=<comma-separated list of formats, with no space after "=">
</code>
specify the format.
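As a minimal sketch of how such a call could be assembled in a script, the hypothetical helper `build_sacct_command` below joins a list of format fields into a ''--format='' argument and adds the ''-S'', ''-E'' and ''-X'' options only when requested. It only builds the argument list; actually running it (for example via `subprocess.run`) requires a system with SLURM installed.

```python
def build_sacct_command(fields, start=None, end=None, allocations_only=False):
    """Return the argument list for a sacct call (hypothetical helper)."""
    cmd = ["sacct"]
    if start:
        cmd += ["-S", start]
    if end:
        cmd += ["-E", end]
    if allocations_only:
        cmd.append("-X")
    # --format takes a comma-separated list with no space after "="
    cmd.append("--format=" + ",".join(fields))
    return cmd

cmd = build_sacct_command(["JobID", "Start", "End", "NNodes"],
                          start="2015-04-23", allocations_only=True)
print(" ".join(cmd))  # sacct -S 2015-04-23 -X --format=JobID,Start,End,NNodes
```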
Available formats are displayed with the options
<code>
-e, --helpformat

[…@… ~]$ sacct -e
# or
[…@… ~]$ sacct --helpformat
</code>

A shortcut for showing all parameters is the option
<code>
-l, --long
</code>
which is equivalent to specifying -o jobid,jobname,partition,maxvmsize,maxvmsizenode,maxvmsizetask,
avevmsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,
maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,
alloccpus,elapsed,state,exitcode,maxdiskread,maxdiskreadnode,maxdiskreadtask,
avediskread,maxdiskwrite,maxdiskwritenode,maxdiskwritetask,avediskwrite,
allocgres,reqgres

whereas minimum information (-o jobid,state,exitcode) is returned via the option
<code>
-b, --brief
</code>

====Example 2: start time|end time====

<code>
sacct -S YYYY-MM-DD[THH:MM[:SS]]
sacct -E YYYY-MM-DD[THH:MM[:SS]]
sacct -S 2015-05-18T09:00:01 -E 2015-05-18T12:02:01 -X -T

# Valid time formats are…
# HH:MM[:SS] [AM|PM]
# MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
# MM/DD[/YY]-HH:MM[:SS]
# YYYY-MM-DD[THH:MM[:SS]]
</code>

====Example 3: job state====

<code>
sacct -s R -S 2015-05-19T17:00 -E 2015-05-19T18:00 -X -T -o JobID,Start,End,State
</code>

In this example we ask for jobs that were in the state running (-s R) during the given time interval (-S … start time and -E … end time). For -X and -T see below; for -o … see above. The output may look like this:
<code>
       JobID               Start                 End      State
------------ ------------------- ------------------- ----------
616785       2015-05-19T17:01:33 2015-05-19T17:15:13 CANCELLED+
616835       2015-05-19T17:35:52 2015-05-19T18:00:00    RUNNING
…                              …                   …          …
616175_238   2015-05-19T17:52:00 2015-05-19T17:53:33  COMPLETED
616175_239   2015-05-19T17:52:02 2015-05-19T17:53:38  COMPLETED
616772_1     2015-05-19T17:52:16 2015-05-19T17:52:22     FAILED
616772_2     2015-05-19T17:52:22 2015-05-19T17:52:28     FAILED
</code>

The jobs in this list were running during the selected time interval; the column State, however, reports the present state at the moment the sacct command is executed.
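Output of this kind can be post-processed in a script. The following sketch parses a few rows copied from the example output above and computes the elapsed seconds per job. The function name `parse_rows` and the record layout are hypothetical; header and separator lines are skipped simply because their columns do not parse as timestamps.

```python
from datetime import datetime

SAMPLE = """\
       JobID               Start                 End      State
------------ ------------------- ------------------- ----------
616785       2015-05-19T17:01:33 2015-05-19T17:15:13 CANCELLED+
616835       2015-05-19T17:35:52 2015-05-19T18:00:00    RUNNING
616175_238   2015-05-19T17:52:00 2015-05-19T17:53:33  COMPLETED
"""

def parse_rows(text):
    """Turn 'JobID Start End State' lines into a list of dicts."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # blank or malformed line
        job_id, start, end, state = parts
        try:
            t0 = datetime.strptime(start, fmt)
            t1 = datetime.strptime(end, fmt)
        except ValueError:
            continue  # header or separator line
        rows.append({"job_id": job_id, "state": state,
                     "elapsed_s": int((t1 - t0).total_seconds())})
    return rows

rows = parse_rows(SAMPLE)
for row in rows:
    print(row["job_id"], row["state"], row["elapsed_s"])
```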
Further possible parameters for the option -s are: BF BOOT_FAIL, CA CANCELLED, CD COMPLETED, CF CONFIGURING, CG COMPLETING, F FAILED, NF NODE_FAIL, PD PENDING, PR PREEMPTED, R RUNNING, RS RESIZING, S SUSPENDED, TO TIMEOUT

==== -X -T ====

The option
<code>
[…@… ~]$ sacct -X
# or
[…@… ~]$ sacct --allocations
</code>
is useful because it shows only cumulative statistics for each job, not the intermediate steps.

The option
<code>
[…@… ~]$ sacct -T
# or
[…@… ~]$ sacct --truncate
</code>
truncates time. If a job started before the optionally given start time ''-S YYYY-MM-DD[THH:MM[:SS]]'', its start time is truncated to ''YYYY-MM-DD[THH:MM[:SS]]''. The same applies to the end time and ''-E YYYY-MM-DD[THH:MM[:SS]]''.
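The effect of truncation can be illustrated with a small sketch that clamps a job's start and end time to the requested window. It mimics the behaviour described above and is not taken from sacct itself; the function name `truncate_interval` is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"

def truncate_interval(job_start, job_end, window_start, window_end):
    """Clamp a job's start and end to the window given by -S and -E."""
    start = max(datetime.strptime(job_start, FMT),
                datetime.strptime(window_start, FMT))
    end = min(datetime.strptime(job_end, FMT),
              datetime.strptime(window_end, FMT))
    return start.strftime(FMT), end.strftime(FMT)

# A job running from 16:30 to 18:40, queried for the window 17:00 to 18:00
print(truncate_interval("2015-05-19T16:30:00", "2015-05-19T18:40:00",
                        "2015-05-19T17:00:00", "2015-05-19T18:00:00"))
```

A job that lies entirely inside the window is returned unchanged.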

Further options restrict the output to particular groups, jobs, job names, QOS, partitions, or users:
<code>
-g gid_list, --gid=gid_list, --group=group_list  # e.g., p70815
-j job(.step), --jobs=job(.step)                 # e.g., 618093.batch,615402.54
--name=jobname_list                              # display jobs that have any of these name(s)
-q, --qos                                        # quality of service (QOS), e.g., normal_0064
-r, --partition=                                 # e.g., mem_0064,mem_0256
-u uid_list, --uid=uid_list, --user=user_list    # e.g., 74911
</code>
  • Last modified: 2015/05/28 13:52
  • by ir