Differences
This shows you the differences between two versions of the page.
doku:job_arrays [2014/06/12 14:46] – ir | doku:job_arrays [2021/05/13 17:46] (current) – removed goldenberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Job arrays ====== | ||
- | Job arrays are sets of similar but **independent** jobs that are submitted together under a single job ID. | ||
- | |||
- | ===== Headline ===== | ||
- | Example: | ||
- | **B**asic **L**ocal **A**lignment **S**earch **T**ool (BLAST; an algorithm for comparing biological sequence information) | ||
- | |||
- | | ||
- | *each instance (“task”) is executed independently | ||
- | *all instances are subsumed under a single job ID | ||
- | | ||
- | *task numbering scheme: '' | ||
- | | ||
- | |||
- | '' | ||
- | | ||
- | < | ||
- | #$ -cwd | ||
- | #$ -N blastArray | ||
- | #$ -t 1-500:1 | ||
- | |||
- | QUERY=query_${SGE_TASK_ID}.fa | ||
- | OUTPUT=blastout_${SGE_TASK_ID}.txt | ||
- | echo " | ||
- | blastall -p blastn -d nt -i $QUERY -o $OUTPUT | ||
- | echo " | ||
- | </ | ||
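The way each task derives its own file names from the task number can be checked locally before submitting. The following is a minimal sketch, assuming the same `query_N.fa` / `blastout_N.txt` naming as the script above; `task_files` is a hypothetical helper and not part of SGE, and the loop only imitates the task IDs the scheduler would hand out.

```shell
#!/bin/bash
# Dry run: print the file names that array task N would use.
# task_files is a hypothetical helper for illustration, not an SGE command.
task_files() {
    local task_id=$1
    echo "query_${task_id}.fa blastout_${task_id}.txt"
}

# Imitate the first three task IDs that "-t 1-500:1" would generate.
# Inside a real job, the scheduler sets SGE_TASK_ID instead.
for id in 1 2 3; do
    task_files "$id"
done
```

A dry run like this makes it easy to spot a mismatch between the array range and the input files that actually exist.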
- | |||
- | < | ||
- | user@l01 $ qsub example_3.sge | ||
- | Your job 10420.1-500: | ||
- | user@l01 $ qstat | ||
- | job-ID prior name | ||
- | ----------------------------------------------------------------------------------------------------------------- | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.55241 blastArray mjr | ||
- | 10420 0.55241 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.56000 blastArray mjr | ||
- | 10420 0.55242 blastArray mjr | ||
- | </ | ||
- | |||
- | ==== Job arrays with multiple single core jobs on one exclusive node ==== | ||
- | |||
- | In some cases the single-core tasks of a job array need more memory than the per-core memory of the | ||
- | compute nodes (3 GB on VSC-1, 2 GB on VSC-2). For such cases the job script below can be used. It starts several single-core | ||
- | tasks on one exclusively used node within a job array, so that fewer tasks share the node's memory. Note the step width in the ''-t'' option: it should equal the number of tasks started per node. | ||
- | |||
- | < | ||
- | #$ -N job_array_with_multiple_single_tasks_on_one_node | ||
- | ### | ||
- | ### request single nodes, on vsc1 all nodes have 24 GB of memory: | ||
- | ### 8-core nodes: | ||
- | #$ -pe mpich 8 | ||
- | ### 12-core nodes: | ||
- | ### #$ -pe mpich 12 | ||
- | ### | ||
- | ### set first and last task_id and step width of the array tasks. | ||
- | ### the step width should be equal to the | ||
- | ### number of tasks per node | ||
- | #$ -t 1-18:3 | ||
- | |||
- | #optimum order for using single cpus | ||
- | cpus=(0 4 2 6 1 5 3 7) | ||
- | |||
- | for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )` | ||
- | do | ||
- | TASK=`expr ${SGE_TASK_ID} + $i` | ||
- | CMD=" | ||
- | taskset -c ${cpus[$i]} $CMD & | ||
- | done | ||
- | #wait for all tasks to be finished before exiting the script | ||
- | wait | ||
- | |||
- | </ | ||
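The script above assumes that the array range is an exact multiple of the step width. A minimal sketch of how the task numbers of one array step could be enumerated while skipping numbers past the end of the range is given below. `tasks_for_step` is a hypothetical helper; the sketch assumes the scheduler exports `SGE_TASK_LAST` alongside `SGE_TASK_ID` and `SGE_TASK_STEPSIZE`, and the default values are only there so that it can be run outside a job.

```shell
#!/bin/bash
# tasks_for_step is a hypothetical helper: it lists the task numbers
# handled by one array step.
# Arguments: first task of the step, step width, last task ID of the array.
tasks_for_step() {
    local first=$1 step=$2 last=$3
    local i task
    for (( i = 0; i < step; i++ )); do
        task=$(( first + i ))
        # Stop at numbers beyond the end of the array range.
        if (( task > last )); then
            break
        fi
        echo "$task"
    done
}

# Inside a job the scheduler sets these variables; the defaults below
# are assumptions that allow a local dry run.
SGE_TASK_ID=${SGE_TASK_ID:-16}
SGE_TASK_STEPSIZE=${SGE_TASK_STEPSIZE:-3}
SGE_TASK_LAST=${SGE_TASK_LAST:-17}

tasks_for_step "$SGE_TASK_ID" "$SGE_TASK_STEPSIZE" "$SGE_TASK_LAST"
```

With a range such as `-t 1-17:3`, the last step would then start only the tasks that exist instead of one task too many.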
- | |||
- | |||
- | ==== Job arrays with multiple tasks within one SGE task step ==== | ||
- | |||
- | If a very large number of tasks has to be started and each task runs only for a very short time, the | ||
- | following construction can be used. Within each array step it runs several tasks, one after another, on | ||
- | the specified nodes. Note the step width in the ''-t'' option. | ||
- | |||
- | < | ||
- | #$ -N job_array_with_multiple_single_tasks_on_one_node | ||
- | #$ -pe mpich <N> | ||
- | ### set first and last task_id and step width of the array tasks. | ||
- | ### the step width should be equal to the | ||
- | ### number of tasks run per array step | ||
- | #$ -t 1-18:3 | ||
- | |||
- | |||
- | for i in `seq 0 $( expr ${SGE_TASK_STEPSIZE} - 1 )` | ||
- | do | ||
- | TASK=`expr ${SGE_TASK_ID} + $i` | ||
- | CMD=" | ||
- | #or | ||
- | # | ||
- | $CMD | ||
- | done | ||
- | |||
- | </ |
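When many short tasks run one after another in the same array step, a single failure is easy to miss. The following is a minimal sketch of a sequential loop that records failed tasks and returns a non-zero status for the step. `run_task` and `run_sequential` are hypothetical names; `run_task` stands in for the real per-task command and is made to fail for task 5 purely for demonstration.

```shell
#!/bin/bash
# run_task is a hypothetical stand-in for the real per-task command.
# It fails for task 5 so that the error path can be seen.
run_task() {
    [ "$1" -ne 5 ]
}

# Run `count` tasks starting at `first`, one after another,
# and report every task that fails.
run_sequential() {
    local first=$1 count=$2
    local i task failed=0
    for (( i = 0; i < count; i++ )); do
        task=$(( first + i ))
        if ! run_task "$task"; then
            echo "task $task failed" >&2
            failed=$(( failed + 1 ))
        fi
    done
    # Return non-zero if any task failed.
    [ "$failed" -eq 0 ]
}

run_sequential 1 3 && echo "step 1 ok"
run_sequential 4 3 || echo "step 4 had failures"
```

In a job script, the first argument would be `${SGE_TASK_ID}` and the second `${SGE_TASK_STEPSIZE}`; returning a non-zero status makes the failed step visible in the job's exit status.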