Differences

This shows you the differences between two versions of the page.

doku:advanced_sge [2014/03/31 11:14] – external edit 127.0.0.1
doku:advanced_sge [2015/01/22 09:57] (current) – [Job arrays with multiple task within one SGE task step] ir
Line 1: Line 1:
-===== Advanced SGE topics ===== 
-    * [[advanced_sge#Job chains|Job chains]] 
-    * [[advanced_sge#Job arrays|Job arrays]] 
-==== Copying back data from a temp. directory after a job is terminated ==== 
-Please refer to [[doku:epilog|user-defined epilog scripts]]! 
- 
 ==== Specifying the maximum runtime (means quicker job execution) / resource reservation ====
 === The problem ===
Line 19: Line 13:
 Please put our precious computing time to use as best as you can.
 
-==== Job chains ====
+====== Job chains ======
 Job chains are sets of consecutive **interdependent** jobs.
  
Line 69: Line 63:
    *Note: -hold_jid <job_name> can only be used to reference jobs of the same user (-hold_jid <job_id> can be used to reference any job)
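A two-step chain that uses the job-name form of -hold_jid could be sketched as follows. The script names step1.sge and step2.sge and the job names chain_step1 and chain_step2 are hypothetical, and the QSUB variable is an assumption added here so that the commands are only printed unless QSUB=qsub is set on a cluster.

```shell
#!/bin/bash
# Dry-run sketch of a two-step job chain (all file and job names are hypothetical).
# QSUB defaults to "echo qsub" so nothing is submitted; set QSUB=qsub on a cluster.
QSUB="${QSUB:-echo qsub}"

submit_chain() {
    # The first job gets a name that the second job can refer to.
    $QSUB -N chain_step1 step1.sge
    # -hold_jid keeps the second job on hold until chain_step1 has completed.
    $QSUB -N chain_step2 -hold_jid chain_step1 step2.sge
}

submit_chain
```

Referring to the predecessor by name avoids parsing the job id from the qsub output, at the cost of the same-user restriction mentioned in the note.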
  
-==== Job arrays ====
+===== Job arrays =====
 Job arrays are sets of similar but **independent** jobs.  Submit sets of similar and independent “tasks”:
      *''qsub -t 1-500:1 example_3.sge'' submits 500 instances of the same script
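To see which task ids a given -t range produces, the first-last:step notation can be expanded locally. This is only an illustration: the helper task_ids is hypothetical and not part of SGE; each printed number corresponds to the value one task would see in SGE_TASK_ID.

```shell
#!/bin/bash
# List the SGE_TASK_ID values generated by "qsub -t first-last:step".
# task_ids is a hypothetical helper for illustration, not an SGE command.
task_ids() {
    local range="$1"            # e.g. 1-18:3
    local first="${range%%-*}"  # text before the dash
    local rest="${range#*-}"    # text after the dash
    local last="${rest%%:*}"    # text before the colon, if any
    local step=1
    case "$rest" in
        *:*) step="${rest#*:}" ;;   # optional step width after the colon
    esac
    # Unquoted expansion joins the numbers with single spaces.
    echo $(seq "$first" "$step" "$last")
}

task_ids 1-18:3     # prints: 1 4 7 10 13 16
```

With a step width of 3, each of the six tasks can handle three consecutive inputs, which is the layout used when several single tasks share one node.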
Line 173: Line 167:
 #$ -N job_array_with_multilple single tasks on one node
 #$ -pe mpich <N>
-### set first and last task_id and stepwidth of array tasks. stepwidth should be identical with the
+### set first and last task_id and stepwidth of array tasks.
+### stepwidth should be identical with the
 ### number of jobs per node
 #$ -t 1-18:3
Line 181: Line 176:
 do
         TASK=`expr ${SGE_TASK_ID} + $i`
-        CMD="run file_$TASK"
+        CMD="run file_$TASK &"
         #or
-        #CMD="mpirun -np $SLOTS ./a.out $TASK"
+        #CMD="mpirun -np $SLOTS ./a.out $TASK &"
         $CMD
 done
+wait
 
 </code>
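One caveat about the pattern in this hunk: when a command line is stored in a variable and later expanded as $CMD, the shell passes a trailing & to the program as a literal argument and does not start the command in the background. A minimal sketch that places the & after the command itself is given below. It assumes a loop over three offsets to match the step width in "#$ -t 1-18:3"; run_one is a hypothetical stand-in for "run file_$TASK", and SGE_TASK_ID gets a default so the sketch also runs outside SGE.

```shell
#!/bin/bash
# Sketch: start several tasks of one array step in the background, then wait.
SGE_TASK_ID="${SGE_TASK_ID:-1}"   # set by SGE inside an array job; default for local runs
STEP=3                            # assumed to equal the step width in "#$ -t 1-18:3"

run_one() {                       # hypothetical stand-in for "run file_$TASK"
    sleep 1
    echo "finished file_$1"
}

run_step() {
    local i task
    for i in $(seq 0 $((STEP - 1))); do
        task=$((SGE_TASK_ID + i))
        run_one "$task" &         # the "&" is written here, not stored inside a variable
    done
    wait                          # block until all background tasks have exited
}

run_step
```

Without the final wait the job script would end while the background tasks are still running, and SGE would treat the task step as finished.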
-==== Specifying runtime limits ====
+====== Specifying runtime limits ======
 
 In SGE Jobs two runtime limits are available: soft (s_rt) and hard (h_rt) runtime limit
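In standard SGE configurations, exceeding s_rt delivers SIGUSR1 to the job as a warning, while exceeding h_rt ends the job with SIGKILL. A job script can use the gap between the two limits to save its state. The sketch below rests on assumptions: the limit values and the job name are examples only, and the signal is simulated by the script itself so that it can be tried without a scheduler.

```shell
#!/bin/bash
# Example values only: the soft limit fires five minutes before the hard limit.
#$ -N runtime_limit_demo
#$ -l s_rt=00:55:00,h_rt=01:00:00

soft_limit_seen=0
on_soft_limit() {
    # Runs when SIGUSR1 arrives: save state before the hard limit ends the job.
    echo "soft runtime limit reached, saving results"
    soft_limit_seen=1
}
trap on_soft_limit USR1

# Simulate the warning locally by signalling this shell process.
# Under SGE the signal would come from the execution daemon instead.
# BASHPID (bash 4+) is the current shell process; fall back to $$ otherwise.
kill -USR1 "${BASHPID:-$$}"
```

The handler should stay short, since the hard limit cannot be caught or delayed.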
Line 272: Line 268:
  
  
-==== Modyfying the machines file on VSC-1 ====
+====== Modyfying the machines file on VSC-1 ======
 
 In cases when not all CPUs of one node are required, the machines file can be modified to guarantee the right behaviour of mpirun. The $TMPDIR/machines file on VSC-1 consists of a number of machine/node names. Each name stands for one CPU on the given machine/node. For an exclusive job on 2 nodes the machine file looks like:
Line 333: Line 329:
  
  
-==== Find out memory usage of running jobs ====
+===== Find out memory usage of running jobs =====
 
 <code>
Line 428: Line 424:
 </code>
 
-==== Tight integration ====
+===== Tight integration =====
 
 Some MPI implementations have a tight integration to SGE. In this case, providing a machinefile could/will be ignored by mpirun.
Line 446: Line 442:
  
  
-==== Using Resources of foreign Projects ====
+===== Using Resources of foreign Projects =====
 
 Sometimes users from one project are allowed by the project manager of another project to use its resources, ie. run jobs within this project by using the '-P <project_name>' flag in the job script.
Line 470: Line 466:
 The wrapper script will change your primary group to that given by the '-P' flag and submit the job.
 
-==== Changing job parameters of already submitted jobs (qalter) ====
+===== Changing job parameters of already submitted jobs (qalter) =====
 
 Jobs that have already been submitted to the queue can be modified using the 'qalter' command. Most common usage is changing queue or hard resources like runtime and free memory:
  • doku/advanced_sge.txt
  • Last modified: 2015/01/22 09:57
  • by ir