Device Selection
Determines how the simulation uses the compute devices (CPU cores and GPUs) that it finds.
Example block:
<DeviceSelection devSel>
maxCpuThreadsPerProc = 0
maxCudaGpusPerProc = 1
NvidiaGpus = [0 1 2 3 4 5 6 7]
NvidiaGpuSpeeds = [200. 200. 200. 200. 200. 200. 200. 200.]
devicesPlacement = 1
</DeviceSelection>
DeviceSelection Parameters
- maxCpuThreadsPerProc (integer, optional, default = 1)
The maximum number of compute CPU threads assigned to each process. Set this to 1 for a pure CPU computation and to 0 for a pure GPU computation. If a process (core) has no GPU assigned to it, the value is set to 1 for that process.
- maxGpusPerProc (integer, optional, default = 1)
The maximum number of GPUs assigned to each process. Set this to 0 for a pure CPU computation, which causes maxCpuThreadsPerProc to be set to 1. Set it to 1 for a pure GPU computation; if no GPUs are found, it is reset to 0. For non-implicit solves the value can be greater than 1, so that a single rank manages multiple GPUs, but this is not recommended.
- NvidiaGpus (integer vector, optional, default = 1)
A vector of the allowed GPU device numbers for this computation.
- NvidiaGpuSpeeds (float vector, optional, default = 200.)
A vector of the speeds of the allowed GPUs, each given relative to the speed of one core.
- devicesPlacement (integer, optional, default = 1)
0: Deprecated. Available GPUs are assigned to all cores.
1: Preferred. Available GPUs are assigned to the last computational cores on a node. For example, with 4 GPUs on a node with 6 cores:
    Process 0 has no GPU and computes on a CPU core.
    Process 1 has no GPU and computes on a CPU core.
    Process 2 is assigned GPU-0.
    Process 3 is assigned GPU-1.
    Process 4 is assigned GPU-2.
    Process 5 is assigned GPU-3.
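The devicesPlacement = 1 rule can be illustrated with a short Python sketch. This is not the simulation's own code: the function name place_gpus_last and its arguments are hypothetical, and it assumes at most one GPU per process (maxGpusPerProc = 1) and that GPUs are handed out in the order they are listed. It only reproduces the mapping described in the 6-core, 4-GPU example.

```python
from typing import List, Optional


def place_gpus_last(num_procs: int, gpu_ids: List[int]) -> List[Optional[int]]:
    """Hypothetical illustration of devicesPlacement = 1 on a single node.

    Returns a list indexed by process number; each entry is the GPU id
    assigned to that process, or None if the process computes on a CPU core.
    Assumes at most one GPU per process.
    """
    # No more GPUs can be used than there are processes on the node.
    num_assigned = min(num_procs, len(gpu_ids))
    # The GPUs go to the last num_assigned processes.
    first_gpu_proc = num_procs - num_assigned

    placement: List[Optional[int]] = [None] * num_procs
    for i in range(num_assigned):
        placement[first_gpu_proc + i] = gpu_ids[i]
    return placement


# The example from the text: 6 processes (cores) and 4 GPUs on one node.
for proc, gpu in enumerate(place_gpus_last(6, [0, 1, 2, 3])):
    label = "CPU only" if gpu is None else f"GPU-{gpu}"
    print(f"Process {proc}: {label}")
```

Under these assumptions, processes 0 and 1 are reported as CPU only and processes 2 through 5 receive GPU-0 through GPU-3, matching the listing above.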