User Tools

Site Tools


doku:job_arrays

Job arrays

Job arrays are sets of similar but independent jobs that are submitted.

Example: BLAST

Basic Local Alignment Search Tool (BLAST; an algorithm for comparing biological sequence information)

  • submit 500 instances of the same script
  • each instance (“task”) is executed independently
  • all instances subsumed with a single job ID
  • variable $SGE_TASK_ID discriminates between instances
  • task numbering scheme: -t <first>-<last>:<stepsize>
  • related: $SGE_TASK_FIRST,$SGE_TASK_LAST,$SGE_TASK_STEPSIZE

qsub example_3.sge

#$ -cwd
#$ -N blastArray
#$ -t 1-500:1

QUERY=query_${SGE_TASK_ID}.fa
OUTPUT=blastout_${SGE_TASK_ID}.txt
echo "processing query $QUERY ..."
blastall -p blastn -d nt -i $QUERY -o $OUTPUT
echo "...done"
user@l01 $ qsub example_3.sge
Your job 10420.1-500:1 ("blastArray") has been submitted.
user@l01 $ qstat
job-ID prior    name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:56 all.q@r10n01                       1 198
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:56 all.q@r10n01                       1 199
  10420 0.56000 blastArray mjr           r     02/13/2007 15:07:11 all.q@r10n01                       1 202
  10420 0.56000 blastArray mjr           r     02/13/2007 15:07:11 all.q@r10n01                       1 203
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:41 all.q@r10n01                       1 196
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:41 all.q@r10n01                       1 197
  10420 0.55241 blastArray mjr           r     02/13/2007 15:08:41 all.q@r10n01                       1 208
  10420 0.55241 blastArray mjr           r     02/13/2007 15:08:41 all.q@r10n01                       1 209
  10420 0.56000 blastArray mjr           r     02/13/2007 15:08:11 all.q@r10n02                       1 204
  10420 0.56000 blastArray mjr           r     02/13/2007 15:08:11 all.q@r10n02                       1 206
  10420 0.56000 blastArray mjr           r     02/13/2007 15:02:11 all.q@r10n02                       1 176
  10420 0.56000 blastArray mjr           r     02/13/2007 15:02:11 all.q@r10n02                       1 177
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:26 all.q@r10n02                       1 182
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:26 all.q@r10n02                       1 183
  10420 0.56000 blastArray mjr           r     02/13/2007 15:07:11 all.q@r10n02                       1 200
  10420 0.56000 blastArray mjr           r     02/13/2007 15:07:11 all.q@r10n02                       1 201
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:11 all.q@r10n03                       1 193
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:11 all.q@r10n03                       1 194
  10420 0.56000 blastArray mjr           r     02/13/2007 15:04:41 all.q@r10n03                       1 190
  10420 0.56000 blastArray mjr           r     02/13/2007 15:04:41 all.q@r10n03                       1 191
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:41 all.q@r10n03                       1 184
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:41 all.q@r10n03                       1 185
  10420 0.56000 blastArray mjr           r     02/13/2007 15:08:11 all.q@r10n03                       1 205
  10420 0.56000 blastArray mjr           r     02/13/2007 15:08:11 all.q@r10n03                       1 207
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:11 all.q@r10n04                       1 192
  10420 0.56000 blastArray mjr           r     02/13/2007 15:05:26 all.q@r10n04                       1 195
  10420 0.56000 blastArray mjr           r     02/13/2007 15:04:26 all.q@r10n04                       1 188
  10420 0.56000 blastArray mjr           r     02/13/2007 15:04:26 all.q@r10n04                       1 189
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:56 all.q@r10n04                       1 186
  10420 0.56000 blastArray mjr           r     02/13/2007 15:03:56 all.q@r10n04                       1 187
  10420 0.55242 blastArray mjr           qw    02/13/2007 14:28:34                                    1 210-500:1

Job arrays with multiple single core jobs on one exclusive node

In some cases job arrays with single core tasks require more memory than the per core memory of the compute nodes (3 GB on VSC-1, 2 GB on VSC-2). For such cases the jobscript below can be used. It starts several single core tasks on one node within a job array. Note the definition of the job stepwidth.

#$ -N job_array_with_multilple single tasks on one node
###
### request single nodes, on vsc1 all nodes have 24 GB of memory:
### 8-core nodes:
#$ -pe mpich 8
### 12-core nodes:
### #$ -pe mpich 12
###
### set first and last task_id and stepwidth of array tasks. 
### stepwidth should be identical with the
### number of jobs per node 
#$ -t 1-18:3

#optimum order for using single cpus
cpus=(0 4 2 6 1 5 3 7)

for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
        TASK=`expr ${SGE_TASK_ID} + $i`
        CMD="run file_$TASK"
        taskset -c ${cpus[$i]} $CMD  & 
done
#wait for all tasks to be finished before exiting the script
wait

Job arrays with multiple task within one SGE task step

In some cases, where a huge number of Job task need to be started and the task's runtime is very short, the following construction can be used. It starts several tasks, one after another, on the specified nodes. Note the definition of the job stepwidth.

#$ -N job_array_with_multilple single tasks on one node
#$ -pe mpich <N>
### set first and last task_id and stepwidth of array tasks. 
### stepwidth should be identical with the
### number of jobs per node 
#$ -t 1-18:3


for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
        TASK=`expr ${SGE_TASK_ID} + $i`
        CMD="run file_$TASK"
        #or 
        #CMD="mpirun -np $SLOTS ./a.out $TASK"
        $CMD  
done
doku/job_arrays.txt · Last modified: 2014/06/12 14:48 by ir