====== Job arrays ======
Job arrays are sets of similar but **independent** jobs that are submitted together.

=== Example: BLAST ===
**B**asic **L**ocal **A**lignment **S**earch **T**ool (BLAST; an algorithm for comparing biological sequence information) is a typical use case for job arrays:

  * submit many similar tasks as one array job instead of many individual jobs
  * each instance (“task”) is executed independently
  * all instances are subsumed under a single job ID
  * task numbering scheme: ''-t <first>-<last>:<step>''
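An array job is submitted with the ''-t'' option of ''qsub''; the script name below is a placeholder, not taken from this page:

<code bash>
# submit one array job consisting of 10 tasks (task IDs 1..10);
# "job.sh" is a placeholder for your own job script
qsub -t 1-10 job.sh
</code>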
Example:
<code>
#$ -cwd
</code>
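A minimal complete array-job script for the BLAST example could look like the following sketch; the ''blastall'' invocation, database and file names are assumptions, not taken from the original listing:

<code bash>
#$ -N blast_array
#$ -cwd
### run 10 independent tasks, task IDs 1..10
#$ -t 1-10
# each task selects its own input file via its task ID;
# the blastall call, database and file names are hypothetical:
blastall -p blastn -d my_db -i input.${SGE_TASK_ID} -o output.${SGE_TASK_ID}
</code>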
==== Job arrays with multiple single-core jobs on one exclusive node ====
In some cases, job arrays with single-core tasks require more memory than the per-core memory of the
compute nodes (3 GB on VSC-1, 2 GB on VSC-2). For such cases the job script below can be used. It starts several single-core
tasks on one node within a job array. Note the definition of the step width in the ''-t'' option. The command assigned to ''CMD'' is a placeholder; replace it with your own program call.

<code>
#$ -N job_array_multiple_tasks_on_one_node
###
### request complete single nodes; on VSC-1 all nodes have 24 GB of memory:
### 8-core nodes:
#$ -pe mpich 8
### 12-core nodes:
### #$ -pe mpich 12
###
### set first and last task id and the step width of the array tasks;
### the step width should be identical to the
### number of jobs per node
#$ -t 1-18:3

# optimum order for using single cpus
cpus=(0 4 2 6 1 5 3 7)

for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  CMD="./my_program input.${TASK}"   # placeholder command
  taskset -c ${cpus[$i]} $CMD &
done
# wait for all tasks to finish before exiting the script
wait
</code>
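The task numbering used above can be checked on any machine outside SGE. This sketch sets ''SGE_TASK_ID'' and ''SGE_TASK_STEPSIZE'' by hand to the values the scheduler would assign to the second array task of ''-t 1-18:3'':

<code bash>
# Simulate the second array task of '#$ -t 1-18:3':
# SGE starts array tasks with SGE_TASK_ID = 1, 4, 7, 10, 13, 16
# and sets SGE_TASK_STEPSIZE=3 for each of them.
SGE_TASK_ID=4
SGE_TASK_STEPSIZE=3
tasks=""
for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  tasks="${tasks}${TASK} "
done
# this array task is responsible for tasks 4, 5 and 6
echo "this job runs tasks: ${tasks}"
</code>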

==== Job arrays with multiple tasks within one SGE task step ====

In cases where a huge number of job tasks needs to be started and each task's runtime is very short, the following construction can be used. It starts several tasks, one after another, on the specified nodes. Note the definition of the step width in the ''-t'' option. The command assigned to ''CMD'' is a placeholder; replace it with your own program call.

<code>
#$ -N job_array_multiple_tasks_per_step
#$ -pe mpich <N>
### set first and last task id and the step width of the array tasks;
### the step width should be identical to the
### number of tasks per step
#$ -t 1-18:3

for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  CMD="./my_program input.${TASK}"   # placeholder command
  $CMD
done
</code>