====== Job arrays ======
Job arrays are sets of similar but **independent** jobs that are submitted together.

=== Example: BLAST ===
**B**asic **L**ocal **A**lignment **S**earch **T**ool (BLAST; an algorithm for comparing biological sequence information) is a typical use case for job arrays:

  * submit many similar tasks as one array job instead of many individual jobs
  * each instance (“task”) is executed independently
  * all instances are subsumed under a single job ID
  * task numbering scheme: ''-t <first>-<last>:<step>''
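An array job is submitted with the ''-t'' option of ''qsub''; the script name below is a placeholder, not taken from this page:

<code bash>
# submit one array job consisting of 10 tasks (task IDs 1..10);
# "job.sh" is a placeholder for your own job script
qsub -t 1-10 job.sh
</code>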
Example:
<code>
#$ -cwd
</code>
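A minimal complete array-job script for the BLAST example could look like the following sketch; the ''blastall'' invocation, database and file names are assumptions, not taken from the original listing:

<code bash>
#$ -N blast_array
#$ -cwd
### run 10 independent tasks, task IDs 1..10
#$ -t 1-10
# each task selects its own input file via its task ID;
# the blastall call, database and file names are hypothetical:
blastall -p blastn -d my_db -i input.${SGE_TASK_ID} -o output.${SGE_TASK_ID}
</code>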
==== Job arrays with multiple single-core jobs on one exclusive node ====
In some cases, job arrays with single-core tasks require more memory than the per-core memory of the
compute nodes (3 GB on VSC-1, 2 GB on VSC-2). For such cases the job script below can be used. It starts several single-core
tasks on one node within a job array. Note the definition of the step width in the ''-t'' option. The command assigned to ''CMD'' is a placeholder; replace it with your own program call.

<code>
#$ -N job_array_multiple_tasks_on_one_node
###
### request complete single nodes; on VSC-1 all nodes have 24 GB of memory:
### 8-core nodes:
#$ -pe mpich 8
### 12-core nodes:
### #$ -pe mpich 12
###
### set first and last task id and the step width of the array tasks;
### the step width should be identical to the
### number of jobs per node
#$ -t 1-18:3

# optimum order for using single cpus
cpus=(0 4 2 6 1 5 3 7)

for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  CMD="./my_program input.${TASK}"   # placeholder command
  taskset -c ${cpus[$i]} $CMD &
done
# wait for all tasks to finish before exiting the script
wait
</code>
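The task numbering used above can be checked on any machine outside SGE. This sketch sets ''SGE_TASK_ID'' and ''SGE_TASK_STEPSIZE'' by hand to the values the scheduler would assign to the second array task of ''-t 1-18:3'':

<code bash>
# Simulate the second array task of '#$ -t 1-18:3':
# SGE starts array tasks with SGE_TASK_ID = 1, 4, 7, 10, 13, 16
# and sets SGE_TASK_STEPSIZE=3 for each of them.
SGE_TASK_ID=4
SGE_TASK_STEPSIZE=3
tasks=""
for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  tasks="${tasks}${TASK} "
done
# this array task is responsible for tasks 4, 5 and 6
echo "this job runs tasks: ${tasks}"
</code>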

==== Job arrays with multiple tasks within one SGE task step ====

In cases where a huge number of job tasks needs to be started and each task's runtime is very short, the following construction can be used. It starts several tasks, one after another, on the specified nodes. Note the definition of the step width in the ''-t'' option. The command assigned to ''CMD'' is a placeholder; replace it with your own program call.

<code>
#$ -N job_array_multiple_tasks_per_step
#$ -pe mpich <N>
### set first and last task id and the step width of the array tasks;
### the step width should be identical to the
### number of tasks per step
#$ -t 1-18:3

for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )`
do
  TASK=`expr ${SGE_TASK_ID} + $i`
  CMD="./my_program input.${TASK}"   # placeholder command
  $CMD
done
</code>