==== Process pinning ====
Memory performance on VSC-2 depends strongly on how processes are placed on the four ''NUMA nodes'' of each compute node.
With Intel MPI, the parameter
  export I_MPI_PIN_PROCESSOR_LIST=1,14,9,6,5,10,13,2,3,12,11,4,7,8,15,0
as mentioned above should always be used to pin (up to) 16 processes to the 16 cores.
For sequential jobs, we recommend using ''taskset'' or ''numactl'', e.g.
  taskset -c 0 our_example_code param1 param2 >out1 &
  taskset -c 8 our_example_code param1 param2 >out2 &
  wait
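The same idea can be sketched with ''numactl'' instead of ''taskset''. The following is only an illustration, not an official VSC-2 script: the names ''CORES'', ''DRY_RUN'' and ''launch_pinned'' are made up for this example, and the core numbers 0 and 8 are simply taken from the ''taskset'' lines above. It uses the standard ''numactl'' options ''--physcpubind'' (pin to a core) and ''--localalloc'' (allocate memory on the NUMA node of that core), so no assumption about the core-to-node mapping is needed. By default the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch: start one sequential job per listed core, each pinned with numactl.
# CORES, DRY_RUN and launch_pinned are illustrative names (assumptions),
# not part of any VSC-2 tooling.

CORES="${CORES:-0 8}"      # cores to use; 0 and 8 follow the taskset example
DRY_RUN="${DRY_RUN:-1}"    # 1: only print the commands, 0: actually run them

launch_pinned() {
    # $1 = core id, $2 = output file, remaining arguments = program and parameters
    pin_core="$1"
    pin_out="$2"
    shift 2
    if [ "$DRY_RUN" = "1" ]; then
        echo "numactl --physcpubind=$pin_core --localalloc $* >$pin_out &"
    else
        # --physcpubind pins the process to the given core,
        # --localalloc allocates memory on the NUMA node of that core
        numactl --physcpubind="$pin_core" --localalloc "$@" >"$pin_out" &
    fi
}

i=1
for core in $CORES; do
    launch_pinned "$core" "out$i" our_example_code param1 param2
    i=$((i + 1))
done
wait    # returns once all background jobs have finished
```

Setting ''DRY_RUN=0'' starts the jobs for real. ''numactl --hardware'' lists which cores belong to which NUMA node, which may help when choosing the entries of ''CORES''.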
With proper pinning, performance gains of up to 200% were observed in synthetic benchmarks.
Note also the examples for [[sequential-codes|sequential jobs]].