Device Selection

Determines how the simulation uses the compute devices (CPU cores and GPUs) that it finds.

Example block:

<DeviceSelection devSel>
  maxCpuThreadsPerProc = 0
  maxCudaGpusPerProc = 1
  NvidiaGpus = [0 1 2 3 4 5 6 7]
  NvidiaGpuSpeeds = [200. 200. 200. 200. 200. 200. 200. 200.]
  devicesPlacement = 1
</DeviceSelection>

DeviceSelection Parameters

maxCpuThreadsPerProc (integer, optional, default = 1)

The maximum number of CPU compute threads assigned to each process. Set this to 1 for a pure CPU computation and to 0 for a pure GPU computation. If a process (core) has no GPU assigned to it, this value is set to 1 for that process.

maxGpusPerProc (integer, optional, default = 1)

The maximum number of GPUs assigned to a process. Set this to 0 for a pure CPU computation, which causes maxCpuThreadsPerProc to be set to 1. For a pure GPU computation, set this to 1; if no GPUs are found, it is set back to 0. For non-implicit solves this can be greater than 1, so that a single rank manages multiple GPUs, but this is not recommended.
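The fallback rules described for maxCpuThreadsPerProc and maxGpusPerProc can be sketched in Python. This is only an illustration of the rules as stated in the text, not the simulation's own code: the function name resolve_device_settings and its gpus_found argument are hypothetical.

```python
def resolve_device_settings(max_cpu_threads, max_gpus, gpus_found):
    """Return the effective (max_cpu_threads, max_gpus) for a process.

    Hypothetical helper that mirrors the documented fallbacks only.
    """
    # If no GPUs are found, the GPU count is set back to 0.
    if gpus_found == 0:
        max_gpus = 0
    # With no GPUs in use, the process computes on one CPU thread.
    if max_gpus == 0:
        max_cpu_threads = 1
    return max_cpu_threads, max_gpus


# A pure GPU request on a machine without GPUs falls back to pure CPU.
print(resolve_device_settings(0, 1, gpus_found=0))
```

Under these assumptions, a pure GPU request (0 threads, 1 GPU) is left unchanged when GPUs are present and becomes a pure CPU run when none are found.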

NvidiaGpus (integer vector, optional, default = 1)

A vector of the allowed GPU device numbers for this computation.

NvidiaGpuSpeeds (float vector, optional, default = 200.)

A vector of the speeds of the allowed GPUs, each relative to the speed of one CPU core.

devicesPlacement (integer, optional, default = 1)

0: Deprecated. Available GPUs are assigned to all cores.

1: Preferred. Available GPUs are assigned to the last computational cores on a node. E.g. with 4 GPUs on a node with 6 cores:

  Process 0 has no GPU, computes on CPU core.
  Process 1 has no GPU, computes on CPU core.
  Process 2 is assigned GPU-0.
  Process 3 is assigned GPU-1.
  Process 4 is assigned GPU-2.
  Process 5 is assigned GPU-3.
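The devicesPlacement = 1 rule can be sketched in Python as follows. This is an illustration only, not the simulation's own code: the function name place_gpus is hypothetical, and the sketch assumes at most one GPU per process and that GPUs are handed out in the order they are listed.

```python
def place_gpus(num_procs, gpu_ids):
    """Map each process on a node to a GPU id, or None for CPU-only.

    Hypothetical sketch: GPUs go to the last processes on the node.
    """
    # Assumption: any GPUs beyond the number of processes stay unused.
    num_gpu_procs = min(num_procs, len(gpu_ids))
    first_gpu_proc = num_procs - num_gpu_procs
    placement = [None] * num_procs
    for offset in range(num_gpu_procs):
        placement[first_gpu_proc + offset] = gpu_ids[offset]
    return placement


# 4 GPUs on a node with 6 cores, as in the example above.
for proc, gpu in enumerate(place_gpus(6, [0, 1, 2, 3])):
    print(proc, "CPU only" if gpu is None else f"GPU-{gpu}")
```

With 6 processes and 4 GPUs this leaves processes 0 and 1 on the CPU and gives GPU-0 through GPU-3 to processes 2 through 5.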