doku:gromacs, last modified 2023/05/17 by msiegel
===== GPU Partition =====
First you have to decide on which hardware GROMACS should run; we call this a ''partition'', see [[doku:slurm|SLURM]]. On any login node, type ''sinfo'' to get a list of the available partitions, or take a look at [[doku:...]]; see the example below. Be aware that each partition has different hardware, so choose the parameters accordingly. GROMACS decides mostly on its own how it wants to work, so don't be surprised if it ignores settings like environment variables.
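The partition choice ends up as a header line in the job script. A minimal sketch, assuming a SLURM cluster; the partition name, GPU count, and time limit below are placeholders, not actual values for this system:

```shell
#!/bin/bash
# Hypothetical SLURM header: "gpu_example" is a placeholder partition name;
# replace it with one of the partitions listed by `sinfo` on a login node.
#SBATCH --job-name=gromacs_test
#SBATCH --partition=gpu_example
#SBATCH --gres=gpu:1          # request one GPU on the node (placeholder count)
#SBATCH --time=00:30:00
```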
===== Installations =====
- | |||
- | Type '' | ||
We provide the following GROMACS installations: | We provide the following GROMACS installations: | ||
- | * '' | + | * '' |
- | * '' | + | * '' |
+ | |||
+ | Type '' | ||
+ | |||
+ | Because of the low efficiency of GROMACS on many nodes with many GPUs via MPI, we do not provide '' | ||
- | Because of the low efficiency of GROMACS on many nodes with many GPUs via MPI, we do not provide '' | ||
===== Batch Script =====
</code>
Type ''sbatch <jobscript>'' to submit your batch script to the [[doku:slurm|SLURM]] queue; it will be executed automatically.
==== CPU / GPU Load ====
There is a whole page dedicated to [[doku:...|monitoring]] the CPU / GPU; for GROMACS the relevant sections are [[doku:...]].
==== Short Example ====
As a short example we ran ''gmx mdrun'' with different options, where ''...''. We don't actually care about the result; we just want to know how many **ns/day** we can get, which GROMACS reports at the end of every run. Such a short test can be done in no time.

The following table lists our 5 tests: without any options GROMACS already runs fine (a). Setting the number of tasks (b) is not needed; if set wrongly, it can even slow the calculation down significantly (c) due to over-provisioning! We advise enforcing pinning; in our example it does not show any effect though (d), so we assume that the tasks are pinned automatically already. The only further improvement we could get was using the ''...'' option, which puts more load on the GPU (e).
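Since only the **ns/day** figure matters for such a test, it can be pulled straight out of the log. A minimal sketch, assuming the ''Performance:'' summary that GROMACS appends to ''md.log'' at the end of a run; the sample numbers are made up:

```shell
# Write a fake end-of-log excerpt for illustration; a real run produces
# this summary itself (columns: ns/day and hour/ns).
cat > md.log <<'EOF'
                 (ns/day)    (hour/ns)
Performance:      123.456        0.194
EOF

# Extract the ns/day value from the performance summary line.
awk '/^Performance:/ {print $2}' md.log   # prints 123.456
```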
^ # ^ cmd ^ ns / day ^ cpu load / % ^ gpu load / % ^ notes ^
In most cases one node is **better** than more nodes.
In some cases, for example a large molecule like Test 7, you might want to run GROMACS on multiple nodes in parallel using MPI, with multiple GPUs (one per node). We strongly encourage you to test whether you actually benefit from running with GPUs on many nodes: GROMACS can perform considerably worse on many nodes in parallel than on a single one!
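One way to run such a check is to compare the **ns/day** of a single-node run with a multi-node run of the same system. A minimal sketch with made-up numbers; substitute the figures GROMACS reports for your own runs:

```shell
# Hypothetical throughput figures in ns/day; replace with your own results.
single_node=85.0   # 1 node
multi_node=70.0    # 4 nodes via MPI

# If the ratio is below 1.0, the multi-node run is actually slower.
awk -v a="$single_node" -v b="$multi_node" \
    'BEGIN { printf "multi/single = %.2f\n", b / a }'   # prints multi/single = 0.82
```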
Run GROMACS on multiple nodes with:
</code>
The reason for this is that the graphics card does more work than the CPU. GROMACS needs to copy data between the different ranks on the CPUs and all GPUs, which takes more time with more ranks. GROMACS notices that and shows ''...'' in the log of Test 1 with 16 ranks on 1 node: the ''...'' makes up a large part of the time spent!
<code>