IntraNodeDecomp
Specifies the domain decomposition to use on each compute node. The DomainDecomp will use the first IntraNodeDecomp block it owns; all others are ignored. If the IntraNodeDecomp block is omitted from the DomainDecomp, the domain will be partitioned across processors according to the regular decomposition algorithm described in the DomainDecomp section.
Note
When running simulations on multiple compute nodes, Vorpal assumes that all nodes are assigned the same number of MPI processes; an exception is thrown if this is not the case. Support for non-homogeneous node architectures and decompositions is planned for a future release.
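For orientation, the IntraNodeDecomp block is nested directly inside the DomainDecomp block that owns it. A minimal sketch of the nesting, assuming a DomainDecomp named globalDecomp (both block names are illustrative, and any other DomainDecomp attributes are omitted):

<DomainDecomp globalDecomp>
  <IntraNodeDecomp nodeDecomp>
    kind = regular
  </IntraNodeDecomp>
</DomainDecomp>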
IntraNodeDecomp Parameters
- allowedDirs (integer vector, optional, default = [0 1 2])
Restricts the directions along which the node domain may be split between processors. At least one valid direction must be given. Valid for the regular and sliced kinds.
- kind (string, required)
Specifies the type of decomposition to use; must be one of the following options:
regular
Evenly divides the node domain among processors, starting with the largest dimension, using the same default algorithm described in the DomainDecomp section. See the first example block below.
sliced
Divides the node domain among processors using boundaries specified in one or more directions. If the number of available processors is not sufficient for the specified decomposition, a regular decomposition is performed instead. If a dimension is left unspecified (free) and the number of processors is a multiple of the minimum required by the decomposition, the domain is automatically split along the free dimension(s) after being divided according to the specified cpu[X/Y/Z]Fracs.
cpuXFracs
(float vector) The processor boundaries in the 0 (x for Cartesian) dimension, expressed as a fraction of the node domain length.
cpuYFracs
(float vector) The processor boundaries in the 1 (y) dimension, expressed as a fraction of the node domain length.
cpuZFracs
(float vector) The processor boundaries in the 2 (z) dimension, expressed as a fraction of the node domain length.
manual
Allows the user to specify NodeCpu blocks that manually define the intranode decomposition. The number of NodeCpu blocks must match the number of MPI ranks with which the simulation is run.
NodeCpu
(block) Describes the bounds of a single CPU rank via its lowerBounds and upperBounds attributes (integer vectors).
Example IntraNodeDecomp Blocks
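A regular intranode decomposition needs only the kind parameter, optionally restricted with allowedDirs. A minimal sketch (the block name regIntraDecomp is illustrative):

<IntraNodeDecomp regIntraDecomp>
  kind = regular
  allowedDirs = [1 2]
</IntraNodeDecomp>

The following sliced example specifies processor boundaries in z and leaves y free: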
<IntraNodeDecomp autoYmanualZ>
  kind = sliced
  allowedDirs = [1 2]
  cpuZFracs = [0.45 0.55]
</IntraNodeDecomp>
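Here cpuZFracs places boundaries at 45% and 55% of the node domain length in z, cutting it into three slabs; because direction 1 is also allowed but unspecified, any remaining factor of processors beyond three is distributed along y automatically. The next example instead defines the decomposition entirely by hand: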
<IntraNodeDecomp manIntraDecomp>
  kind = manual
  <NodeCpu cpu0>
    lowerBounds = [0 0]
    upperBounds = [5 5]
  </NodeCpu>
  <NodeCpu cpu1>
    lowerBounds = [5 0]
    upperBounds = [10 5]
  </NodeCpu>
  <NodeCpu cpu2>
    lowerBounds = [0 5]
    upperBounds = [5 10]
  </NodeCpu>
  <NodeCpu cpu3>
    lowerBounds = [5 5]
    upperBounds = [10 10]
  </NodeCpu>
</IntraNodeDecomp>
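This manual decomposition tiles a 10x10-cell node domain into four 5x5 quadrants, so it must be run with exactly four MPI ranks (one per NodeCpu block).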