Differences

This shows you the differences between two versions of the page.

doku:job_arrays [2014/05/28 11:13] irdoku:job_arrays [2014/06/12 14:48] ir
Line 1: Line 1:
 ====== Job arrays ====== ====== Job arrays ======
  
-Job arrays are sets of similar but **independent** jobs.  Submit sets of similar and independent “tasks”+Job arrays are sets of similar but **independent** jobs that are submitted together under a single job ID. 
-     *''qsub -t 1-500:1 example_3.sge'' submits 500 instances of the same script+ 
 +=== Example: BLAST === 
 +**B**asic **L**ocal **A**lignment **S**earch **T**ool (BLAST; an algorithm for comparing biological sequence information) 
 + 
 +     *submit 500 instances of the same script
      *each instance (“task”) is executed independently      *each instance (“task”) is executed independently
      *all instances subsumed with a single job ID      *all instances subsumed with a single job ID
Line 8: Line 12:
      *task numbering scheme: ''-t <first>-<last>:<stepsize>''      *task numbering scheme: ''-t <first>-<last>:<stepsize>''
      *related: ''$SGE_TASK_FIRST,$SGE_TASK_LAST,$SGE_TASK_STEPSIZE''      *related: ''$SGE_TASK_FIRST,$SGE_TASK_LAST,$SGE_TASK_STEPSIZE''
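
To illustrate the numbering scheme, here is a minimal sketch of how ''-t <first>-<last>:<stepsize>'' expands into task IDs. The helper ''list_task_ids'' is hypothetical (it is not part of SGE); it only reproduces the arithmetic: IDs start at ''<first>'' and grow by ''<stepsize>'' while they do not exceed ''<last>''.

```shell
#!/bin/sh
# Hypothetical helper (not an SGE command): prints the task IDs that
# "-t <first>-<last>:<stepsize>" would generate.
list_task_ids() {
    # $1 = first, $2 = last, $3 = stepsize
    id=$1
    ids=""
    while [ "$id" -le "$2" ]; do
        ids="${ids:+$ids }$id"   # append with a single space separator
        id=$((id + $3))
    done
    echo "$ids"
}

# Task IDs for "-t 1-18:3"
list_task_ids 1 18 3
```

Each printed value corresponds to one ''$SGE_TASK_ID'' seen by an instance of the script.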
 +
 +''qsub example_3.sge'' 
      
-Example: 
 <code> <code>
 #$ -cwd #$ -cwd
Line 61: Line 66:
 </code> </code>
  
 +==== Job arrays with multiple single core jobs on one exclusive node ====
  
 +In some cases the single-core tasks of a job array require more memory than the per-core memory of the 
 +compute nodes (3 GB on VSC-1, 2 GB on VSC-2). For such cases the job script below can be used. It starts several single-core 
 +tasks on one node within a job array. Note the definition of the job stepwidth: it sets how many tasks are started on each node.
 +
 +<code>
 +#$ -N job_array_with_multiple_single_tasks_on_one_node
 +###
 +### request single nodes, on vsc1 all nodes have 24 GB of memory:
 +### 8-core nodes:
 +#$ -pe mpich 8
 +### 12-core nodes:
 +### #$ -pe mpich 12
 +###
 +### set first and last task_id and stepwidth of array tasks. 
 +### stepwidth should be identical with the
 +### number of jobs per node 
 +#$ -t 1-18:3
 +
 +#optimum order for using single cpus
 +cpus=(0 4 2 6 1 5 3 7)
 +
 +for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
 +do
 +        TASK=`expr ${SGE_TASK_ID} + $i`
 +        CMD="run file_$TASK"
 +        taskset -c ${cpus[$i]} $CMD  & 
 +done
 +#wait for all tasks to be finished before exiting the script
 +wait
 +
 +</code>
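
The loop above always starts ''$SGE_TASK_STEPSIZE'' tasks per node, which fits when the range is a multiple of the stepwidth (as in ''-t 1-18:3''). If it is not, the last node would also start tasks beyond the requested range. A minimal sketch of a guard follows; ''subtasks_for'' is a hypothetical helper, and it assumes that ''$SGE_TASK_LAST'' holds the upper bound given to ''-t''.

```shell
#!/bin/sh
# Hypothetical helper: prints the task numbers one array step should run,
# clamped so that they never exceed the last requested task.
subtasks_for() {
    # $1 = SGE_TASK_ID, $2 = SGE_TASK_STEPSIZE, $3 = SGE_TASK_LAST (assumed upper bound)
    task=$1
    end=$(( $1 + $2 - 1 ))
    if [ "$end" -gt "$3" ]; then
        end=$3                     # do not run past the last task
    fi
    out=""
    while [ "$task" -le "$end" ]; do
        out="${out:+$out }$task"
        task=$((task + 1))
    done
    echo "$out"
}

# Last step of "-t 1-17:3": only two tasks remain
subtasks_for 16 3 17
```

Inside a job script the call would be ''subtasks_for "$SGE_TASK_ID" "$SGE_TASK_STEPSIZE" "$SGE_TASK_LAST"'', and the loop would iterate over the printed numbers.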
 +
 +
 +==== Job arrays with multiple tasks within one SGE task step ====
 +
 +In some cases, where a huge number of job tasks need to be started and each task's runtime is very short, the 
 +following construction can be used. It starts several tasks, one after another, on
 +the specified nodes. Note the definition of the job stepwidth.
 +
 +<code>
 +#$ -N job_array_with_multiple_single_tasks_on_one_node
 +#$ -pe mpich <N>
 +### set first and last task_id and stepwidth of array tasks. 
 +### stepwidth should be identical with the
 +### number of tasks run one after another per array step 
 +#$ -t 1-18:3
 +
 +
 +for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
 +do
 +        TASK=`expr ${SGE_TASK_ID} + $i`
 +        CMD="run file_$TASK"
 +        #or 
 +        #CMD="mpirun -np $SLOTS ./a.out $TASK"
 +        $CMD  
 +done
 +
 +</code>
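
As a rough illustration of the effect of the stepwidth, the number of array steps the scheduler has to start can be computed from the ''-t'' arguments. The helper ''count_array_steps'' below is hypothetical and only does the arithmetic; it is a sketch, not an SGE command.

```shell
#!/bin/sh
# Hypothetical helper: number of array steps generated by
# "-t <first>-<last>:<stepsize>".
count_array_steps() {
    # $1 = first, $2 = last, $3 = stepsize (integer division rounds down)
    echo $(( ($2 - $1) / $3 + 1 ))
}

# 18 tasks bundled three at a time
count_array_steps 1 18 3
# 500 tasks, one per step
count_array_steps 1 500 1
```

A larger stepwidth means fewer, longer-running array steps, which is the point of bundling many very short tasks.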