Working With MPI
MPI, short for Message Passing Interface, is a message-passing standard for parallel applications that is widely used in HPC. This tutorial is intended for administrators who want to enable MPI support on their endpoint and for users who want to submit MPI jobs to such an endpoint.
Note
Compute’s MPI system is a wrapper over Parsl’s MPI system. For more details on how the latter works, see the Parsl documentation.
Configure an Endpoint
If you are starting from scratch, you will need to initialize an endpoint with
the configure subcommand:
$ globus-compute-endpoint configure my-ep
Modify the Configuration Template
Note
For more details on MPI-related configuration options, see Configuring for MPI.
Start with a clean user_config_template.yaml.j2:
engine:
  type: GlobusComputeEngine
  max_workers_per_node: 1
  provider:
    type: LocalProvider
    min_blocks: 0
    max_blocks: 1
    init_blocks: 1
Update the config to use GlobusMPIEngine and SimpleLauncher:
engine:
  type: GlobusMPIEngine
  max_workers_per_node: 1
  provider:
    type: LocalProvider
    launcher:
      type: SimpleLauncher
    min_blocks: 0
    max_blocks: 1
    init_blocks: 1
Depending on the target system, set the correct provider and mpi_launcher. For this
tutorial, we’ll use Slurm; check Example Configurations for examples based
on other schedulers.
engine:
  type: GlobusMPIEngine
  mpi_launcher: srun
  max_workers_per_node: 1
  provider:
    type: SlurmProvider
    launcher:
      type: SimpleLauncher
    min_blocks: 0
    max_blocks: 1
    init_blocks: 1
Finally, configure the shape of the resources available to MPI tasks. Set
nodes_per_block to configure the size of the block Parsl will reserve for MPI
tasks, and set max_workers_per_block to limit how many MPI tasks can be run per
block.
engine:
  type: GlobusMPIEngine
  mpi_launcher: srun
  provider:
    type: SlurmProvider
    launcher:
      type: SimpleLauncher
    nodes_per_block: 8
  max_workers_per_block: 4
In this example, each block gets 8 nodes, and up to 4 MPI tasks can run at once on a single block.
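To make the arithmetic behind these two settings concrete, here is a minimal sketch of an upper bound on how many MPI tasks could run at the same time in one block. It is a simplified model, not Parsl's scheduler: the function name `max_concurrent_tasks` is hypothetical, and it assumes that every task requests the same number of nodes and that nodes are not shared between tasks.

```python
def max_concurrent_tasks(nodes_per_block: int,
                         max_workers_per_block: int,
                         nodes_per_task: int) -> int:
    """Upper bound on simultaneous MPI tasks in one block (simplified model).

    Assumes every task requests `nodes_per_task` nodes and that a node is
    used by at most one task at a time.
    """
    if nodes_per_task < 1:
        raise ValueError("nodes_per_task must be at least 1")
    if nodes_per_task > nodes_per_block:
        return 0  # a single task would not fit in the block at all
    # The block can be carved into this many node groups...
    by_nodes = nodes_per_block // nodes_per_task
    # ...but the worker limit caps how many tasks may run at once.
    return min(by_nodes, max_workers_per_block)


# Values from the configuration above: 8 nodes per block, at most 4 workers.
for nodes_per_task in (1, 2, 4, 8):
    print(nodes_per_task, max_concurrent_tasks(8, 4, nodes_per_task))
```

Under these assumptions, small tasks are capped by max_workers_per_block, while large tasks are capped by the number of nodes in the block.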
Start the Endpoint
Once the endpoint is configured, we can start it up.
$ globus-compute-endpoint start my-ep
Take note of the endpoint ID emitted to the console; we will use it later in the tutorial.
Submit Tasks from the SDK
Note
For more details on MPI support on the SDK, see Submitting MPI Tasks.
We’ll use the Executor to submit our MPI tasks, but first, we need to define our MPI
function. In this case, we’ll just run hostname on every MPI node:
from globus_compute_sdk import MPIFunction
mpi_func = MPIFunction("hostname")
An MPIFunction can be submitted like any other Python function. When submitted, it
runs bash commands on the endpoint, with the appropriate MPI executable and arguments
handled by Parsl.
In order to run MPI tasks we need to give our Executor a resource_specification,
which tells the endpoint how to distribute the nodes amongst MPI workers:
from globus_compute_sdk import Executor

ep_id = "..."  # Endpoint ID from before
with Executor(endpoint_id=ep_id) as ex:
    ex.resource_specification = {
        "num_ranks": 1,  # launch 1 MPI rank (process) in total
        "num_nodes": 8,  # and give the task 8 nodes
    }
    f = ex.submit(mpi_func)
    mpi_result = f.result()
    print(mpi_result.stdout)
MPIFunction submissions return ShellResult objects, hence the .stdout.
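To give a rough feel for what it means for Parsl to handle the MPI executable and arguments, the sketch below builds the kind of `srun` command line that a resource_specification could translate into. This is an illustration only: `sketch_srun_command` is a hypothetical helper, and the prefix and flags that Parsl actually generates may differ. The flags used here (`--nodes`, `--ntasks`, `--ntasks-per-node`) are standard `srun` options.

```python
import shlex


def sketch_srun_command(command: str, spec: dict) -> str:
    """Build an illustrative srun command line from a resource specification.

    Hypothetical helper for explanation only; not Parsl's implementation.
    """
    num_nodes = spec["num_nodes"]
    if "num_ranks" in spec:
        num_ranks = spec["num_ranks"]
    else:
        # Total ranks follow from ranks per node times the number of nodes.
        num_ranks = spec["ranks_per_node"] * num_nodes
    parts = ["srun", f"--nodes={num_nodes}", f"--ntasks={num_ranks}"]
    if "ranks_per_node" in spec:
        parts.append(f"--ntasks-per-node={spec['ranks_per_node']}")
    # Append the user's command, split into shell words.
    parts.extend(shlex.split(command))
    return shlex.join(parts)


print(sketch_srun_command("hostname", {"num_nodes": 2, "ranks_per_node": 3}))
# srun --nodes=2 --ntasks=6 --ntasks-per-node=3 hostname
```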
Finally, each task can have its own resource_specification:
with Executor(endpoint_id=ep_id) as ex:
    for ranks in range(1, 4):  # 1, 2, 3 (the upper bound 4 is excluded)
        # run "ranks" MPI ranks on each node
        ex.resource_specification = {
            "ranks_per_node": ranks,
            "num_nodes": 2,
        }
        f = ex.submit(mpi_func)
        mpi_result = f.result()
        print(mpi_result.stdout)
This should produce output similar to the following. The lines starting with # are annotations added here for readability; the code above prints only the hostnames:
# 2 nodes, 1 rank
my-node-1
my-node-2
# 2 nodes, 2 ranks
my-node-2
my-node-1
my-node-1
my-node-2
# 2 nodes, 3 ranks
my-node-1
my-node-2
my-node-1
my-node-2
my-node-2
my-node-1
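Because the order of the hostnames can vary between runs, it can be easier to check the result by counting lines per host. The following is a minimal sketch using only the standard library; `ranks_per_host` is a hypothetical helper, and `sample_stdout` is the sample output for the 3-ranks-per-node case copied from above rather than output from a real run.

```python
from collections import Counter


def ranks_per_host(stdout: str) -> Counter:
    """Count how many lines (one per MPI rank) each hostname produced."""
    hosts = [line.strip() for line in stdout.splitlines() if line.strip()]
    return Counter(hosts)


# Sample output for num_nodes=2, ranks_per_node=3 (copied from above).
sample_stdout = """\
my-node-1
my-node-2
my-node-1
my-node-2
my-node-2
my-node-1
"""

counts = ranks_per_host(sample_stdout)
num_nodes, ranks_per_node = 2, 3

# Every node should appear once per rank placed on it.
assert len(counts) == num_nodes
assert all(n == ranks_per_node for n in counts.values())
print(dict(counts))
# {'my-node-1': 3, 'my-node-2': 3}
```

With a real task, the same check could be applied to `mpi_result.stdout`.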