Multi-User Endpoints

This guide builds on the Endpoint User Guide to cover the additional features and configuration steps needed to set up multi-user endpoints.

Multi-user endpoints use the same installation process as regular endpoints. See Installing the Endpoint for more information.

Tip

For those just looking to get up and running, see the Administrator Quickstart, below.

Overview

Multi-user endpoints enable administrators to securely offer compute resources to users without requiring shell access (e.g., SSH). These endpoints support all features described in the Endpoint User Guide page, plus the ability to map Globus Auth identities of function-submitting users to local POSIX user accounts, then launch user endpoint processes as those mapped users.

For a detailed look at how a task makes its way to a user endpoint process, see Tracing a Task to the Endpoint.

Key Benefits

For Administrators

The biggest benefit of a multi-user endpoint setup is a lower barrier to entry for legitimate users of a site. To date, knowledge of the command line has been critical to most users of High Performance Computing (HPC) systems, though only as a necessity of infrastructure rather than for any legitimate scientific purpose. A multi-user endpoint allows a user to ignore many of the important-but-not-really details of plumbing, like logging in through SSH or restarting user-only daemons. The only thing they need to do is run their scripts locally on their own workstation, and the rest “just works.”

Another boon for administrators is the ability to fine-tune and pre-configure what resources users may utilize. For example, many users struggle to discover which interface is routed to a cluster’s internal network; the administrator can preset that, completely bypassing the question. Using ALCF’s Polaris as an example, the administrator could use the following user configuration template (user_config_template.yaml.j2) to place all jobs sent to this multi-user endpoint on the debug-scaling queue, and pre-select the obvious defaults (per the documentation):

/root/.globus_compute/debug_scaling_ep/user_config_template.yaml.j2
display_name: Polaris at ALCF - debug-scaling queue
engine:
  type: GlobusComputeEngine
  address:
    type: address_by_interface
    ifname: bond0

  strategy:
    type: SimpleStrategy
    max_idletime: 30

  provider:
    type: PBSProProvider
    queue: debug-scaling

    account: {{ ACCOUNT_ID }}

    # Command to be run before starting a worker
    # e.g., "module load Anaconda; source activate parsl_env"
    worker_init: {{ WORKER_INIT_COMMAND|default() }}

    init_blocks: 0
    min_blocks: 0
    max_blocks: 1
    nodes_per_block: {{ NODES_PER_BLOCK|default(1) }}

    walltime: 1:00:00

    launcher:
      type: MpiExecLauncher

idle_heartbeats_soft: 10
idle_heartbeats_hard: 5760

The user must specify the ACCOUNT_ID, and could optionally specify the WORKER_INIT_COMMAND and NODES_PER_BLOCK variables. If the user’s jobs finish and no more work comes in after max_idletime seconds (30s), the user endpoint process will scale down and consume no more wall time.

For Users

Under the multi-user paradigm, users largely benefit from not having to be quite so aware of an endpoint and its configuration. As the administrator will have taken care of most of the smaller details (e.g., installation, internal interfaces, queue policies), the user is able to write a consuming script, knowing only the endpoint ID and their system accounting username:

import concurrent.futures
from globus_compute_sdk import Executor

def jitter_double(task_num):
    import random
    return task_num, task_num * (1.5 + random.random())

polaris_site_id = "..."  # as acquired from the admin in the previous section
with Executor(
    endpoint_id=polaris_site_id,
    user_endpoint_config={
        "ACCOUNT_ID": "user_allocation_account_id",
        "NODES_PER_BLOCK": 2,
    }
) as ex:
    futs = [ex.submit(jitter_double, task_num) for task_num in range(100)]
    for fut in concurrent.futures.as_completed(futs):
        print("Result:", fut.result())

It is a boon for the researcher to see the relevant configuration variables immediately adjacent to the code, as opposed to hidden in the endpoint configuration and behind an opaque endpoint ID. A multi-user endpoint removes almost half of the infrastructure plumbing that the user must manage — many users will barely even need to open their own terminal, much less an SSH terminal on a login node.

Configuring a Multi-User Endpoint

The configure subcommand must be run as a privileged user (e.g., root) to properly generate the config.yaml and example_identity_mapping_config.json files, along with other default files in $HOME/.globus_compute/:

# gce configure my_mu_ep
Created profile for endpoint named <my_mu_ep>

    Configuration file: /root/.globus_compute/my_mu_ep/config.yaml

    Example identity mapping configuration: /root/.globus_compute/my_mu_ep/example_identity_mapping_config.json

    User endpoint configuration template: /root/.globus_compute/my_mu_ep/user_config_template.yaml.j2
    User endpoint configuration schema: /root/.globus_compute/my_mu_ep/user_config_schema.json
    User endpoint environment variables: /root/.globus_compute/my_mu_ep/user_environment.yaml

Use the `start` subcommand to run it:

globus-compute-endpoint start my_mu_ep

config.yaml

The default multi-user endpoint config.yaml file contains one additional field to specify the identity mapping file path:

The default multi-user config.yaml configuration
amqp_port: 443
display_name: null
public: true
identity_mapping_config_path: /root/.globus_compute/my_mu_ep/example_identity_mapping_config.json

Please refer to Manager Endpoint Configuration for details on each field.

example_identity_mapping_config.json

This is a valid-syntax-but-will-never-successfully-map example identity mapping configuration file. It is a JSON list of identity mapping configurations that will be tried in order. By implementation within the endpoint code base, the first configuration to return a match “wins.” In this example, there is only one configuration, an expression_identity_mapping#1.0.0. This means that the match field uses a subset of regular expression syntax to scan the username field from the passed identity set. The library adds the ^ and $ anchors around the expression before searching, so the actual regular expression used would be ^(.*)@example.com$. Finally, if a match is found, the output is the first saved group (i.e., {0}). As an example, if a username field contained mickey97@example.com, then this configuration would return mickey97, and the multi-user endpoint would then use getpwnam(3) to look up mickey97. But if no username field in any of the identities in the set ended with @example.com, then it would not match and the start request would fail.

The default example identity mapping configuration; technically functional but pragmatically useless
[
  {
    "comment": "For more examples, see: https://docs.globus.org/globus-connect-server/v5.4/identity-mapping-guide/",
    "DATA_TYPE": "expression_identity_mapping#1.0.0",
    "mappings": [
      {
        "source": "{username}",
        "match": "(.*)@example.com",
        "output": "{0}"
      }
    ]
  }
]
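
As a rough illustration of the matching rule described above, the following standard-library sketch anchors the match expression, searches the rendered source field, and formats the output from the saved groups. The helper name apply_mapping is hypothetical, and Python's re module stands in for the library's regular expression subset; this is an approximation for reasoning about a configuration, not the globus-identity-mapping implementation.

```python
import re


def apply_mapping(mapping, identity):
    """Return the mapped username for one identity, or None if no match.

    Hypothetical helper, for illustration only.
    """
    # Render the source template, e.g. "{username}" -> "mickey97@example.com"
    source = mapping["source"].format(**identity)
    # Anchor the expression so that it must match the whole field
    anchored = "^" + mapping["match"] + "$"
    found = re.search(anchored, source)
    if found is None:
        return None
    # "{0}" in the output refers to the first saved group
    return mapping["output"].format(*found.groups())


mapping = {"source": "{username}", "match": "(.*)@example.com", "output": "{0}"}
print(apply_mapping(mapping, {"username": "mickey97@example.com"}))  # mickey97
print(apply_mapping(mapping, {"username": "mickey97@example.org"}))  # None
```

The returned name is only a candidate: it must still resolve to a local account on the endpoint host.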

Some setups may require an external script or admin-supplied executable to properly map an identity, so this second example showcases the external_identity_mapping#1.0.0 DATA_TYPE. The command is a list of arguments, with the first element as the actual executable. The flags specified here are for illustrative purposes to match the custom_script example. This executable must accept an identity_mapping_input#1.0.0 JSON document via stdin, output an identity_mapping_output#1.0.0 JSON document to stdout, and return a 0 exit code. (A result with a non-zero exit code will be ignored.)

An external program identity mapping configuration example
[
  {
    "DATA_TYPE": "external_identity_mapping#1.0.0",
    "command": ["/root/custom_script", "--some", "flag", "-a", "-b", "-c"]
  }
]

The syntax of this document is defined in the Globus Connect Server Identity Mapping documentation. It is a JSON-list of mapping configurations, and there are two provided strategies to determine a mapping:

  • expression_identity_mapping#1.0.0 — Regular Expression based mapping applies an administrator-defined regular expression against any field in the input identity documents, returning None or the matched string. (Example below.)

  • external_identity_mapping#1.0.0 — Invoke an administrator-defined external process, passing the input identity documents via stdin, and reading the response from stdout.

Observe that as a list, administrators may implement more than one strategy for mapping identities. While the default mapping configuration illustrates the most common approach (regular expression mapping), some setups may require trying multiple avenues to ascertain a proper mapping.

Tip

While developing this file, administrators may appreciate using the globus-idm-validator tool. This script is installed as part of the globus-identity-mapping dependency.

The manager endpoint process watches this file for changes. If an administrator needs to make a live change, simply update the content of the identity mapping file specified by the config.yaml configuration. The manager endpoint process will note the change and atomically apply it: if the new identity mapping configuration is invalid, the previously loaded configuration will remain in place. In both cases (valid or invalid), the endpoint will emit a message to the log.
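
The "apply only if valid" behavior can be pictured as a validate-then-swap step. The sketch below is a simplified model under stated assumptions: the class name MappingHolder and its checks are hypothetical, and the real endpoint performs a much more thorough validation than looking for a list of entries with a DATA_TYPE field.

```python
import json


class MappingHolder:
    """Keeps the last valid identity mapping configuration (hypothetical)."""

    def __init__(self):
        self.config = None

    def reload(self, text):
        """Try to apply new file content; return True if it was applied."""
        try:
            candidate = json.loads(text)
            if not isinstance(candidate, list):
                raise ValueError("expected a JSON list of mapping configurations")
            for entry in candidate:
                if "DATA_TYPE" not in entry:
                    raise ValueError("mapping configuration lacks DATA_TYPE")
        except (ValueError, TypeError) as err:
            # Invalid content: log it and leave the previous configuration alone
            print(f"Invalid identity mapping; keeping previous one: {err}")
            return False
        # Swap only after the whole document has been validated
        self.config = candidate
        print("Identity mapping configuration updated")
        return True
```

Because the swap happens only after validation succeeds, a half-edited file never replaces a working configuration.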

expression_identity_mapping#1.0.0

For example, a simple policy might require that users of a system have an email address at your institution or department. The identity mapping configuration might be:

only_allow_my_institution.json
[
  {
    "DATA_TYPE": "expression_identity_mapping#1.0.0",
    "mappings": [
      {"source": "{email}", "output": "{0}", "match": "(.*)@your_institution.com"},
      {"source": "{email}", "output": "{0}", "match": "(.*)@cs.your_institution.com"}
    ]
  }
]

A Globus Auth identity (input) document might look something like:

An example identity set, containing two linked identities for the same person.
[
  {
    "id": "00000000-0000-4444-8888-111111111111",
    "email": "alicia@legal.your_institution.com",
    "identity_provider": "abcd7238-f917-4eb2-9ace-c523fa9b1234",
    "identity_type": "login",
    "name": "Alicia",
    "organization": null,
    "status": "used",
    "username": "alicia@legal.your_institution.com"
  },
  {
    "id": "00000000-0000-4444-8888-222222222222",
    "email": "roberto@cs.your_institution.com",
    "identity_provider": "ef345063-bffd-41f7-b403-24f97e325678",
    "identity_type": "login",
    "name": "Roberto",
    "organization": "Your Institution, GmbH",
    "status": "used",
    "username": "roberto@your_institution.com"
  }
]

This user has linked both identities, so both identities are in the identity set. Per the configuration, the email of the first identity (alicia@legal.your_institution.com) will not match either regex, but the email of the second (roberto@cs.your_institution.com) will match the second mapping, and the returned username would be roberto. Note that any field could be tested, but this example used email.
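
To check this reasoning by hand, the sketch below walks the example identity set against both mappings using only the standard library. The function name first_match and the iteration order (identities in the outer loop, mappings in the inner loop) are assumptions made for illustration; the library may evaluate in a different order, though the result for this example is the same either way.

```python
import re

MAPPINGS = [
    {"source": "{email}", "output": "{0}", "match": "(.*)@your_institution.com"},
    {"source": "{email}", "output": "{0}", "match": "(.*)@cs.your_institution.com"},
]
IDENTITY_SET = [
    {"email": "alicia@legal.your_institution.com"},
    {"email": "roberto@cs.your_institution.com"},
]


def first_match(identity_set, mappings):
    """Return (source value, local username) for the first identity that maps."""
    for identity in identity_set:
        for mapping in mappings:
            source = mapping["source"].format(**identity)
            # fullmatch behaves like wrapping the expression in ^ and $
            found = re.fullmatch(mapping["match"], source)
            if found:
                return source, mapping["output"].format(*found.groups())
    return None


print(first_match(IDENTITY_SET, MAPPINGS))
```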

external_identity_mapping#1.0.0

Sometimes, more complicated logic may be required (e.g., LDAP lookups), in which case consider the external_identity_mapping#1.0.0 configuration stanza. The administrator may write a script (or generally, an executable) for the required custom logic. The script will be passed an identity_mapping_input#1.0.0 JSON document via stdin, and must output an identity_mapping_output#1.0.0 JSON document on stdout.

An example identity_mapping_input#1.0.0 document
{
  "DATA_TYPE": "identity_mapping_input#1.0.0",
  "identities": [
    {
      "id": "00000000-0000-4444-8888-111111111111",
      "email": "alicia@legal.your_institution.com",
      "identity_provider": "abcd7238-f917-4eb2-9ace-c523fa9b1234",
      "identity_type": "login",
      "name": "Alicia",
      "organization": null,
      "status": "used",
      "username": "alicia@legal.your_institution.com"
    },
    {
      "id": "00000000-0000-4444-8888-222222222222",
      "email": "roberto@cs.your_institution.com",
      "identity_provider": "ef345063-bffd-41f7-b403-24f97e325678",
      "identity_type": "login",
      "name": "Roberto",
      "organization": "Your Institution, GmbH",
      "status": "used",
      "username": "roberto@your_institution.com"
    }
  ]
}

The executable must identify the successfully mapped identity in the output document by the id field. For example, if an LDAP lookup of alicia@legal.your_institution.com were to result in alicia for this endpoint host, then the output document might read:

Hypothetical identity_mapping_output#1.0.0 document from an external script
{
  "DATA_TYPE": "identity_mapping_output#1.0.0",
  "result": [
    {"id": "00000000-0000-4444-8888-111111111111", "output": "alicia"}
  ]
}
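
A minimal sketch of such an external program follows. The LOCAL_ACCOUNTS table is a hypothetical stand-in for a real lookup (e.g., LDAP), and the function names are illustrative; only the stdin/stdout contract and the document shapes come from the description above.

```python
import json
import sys

# Hypothetical lookup table standing in for an LDAP query or similar
LOCAL_ACCOUNTS = {"alicia@legal.your_institution.com": "alicia"}


def map_identities(document):
    """Build an identity_mapping_output#1.0.0 document from an input document."""
    results = []
    for identity in document.get("identities", []):
        local_name = LOCAL_ACCOUNTS.get(identity.get("username"))
        if local_name is not None:
            # Report the match by the id of the identity that was mapped
            results.append({"id": identity["id"], "output": local_name})
    return {"DATA_TYPE": "identity_mapping_output#1.0.0", "result": results}


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the input document from stdin and write the result to stdout."""
    json.dump(map_identities(json.load(stdin)), stdout)
    return 0  # a non-zero exit code would cause the result to be ignored


# When saved as an executable script, finish with: sys.exit(main())
```

Keeping the lookup in a separate function makes it possible to test the mapping logic without piping documents through a subprocess.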

Note

Remember that the identity mapping configuration is a JSON list. Multiple mappings may be defined, and each will be tried in order until one maps the identity successfully or no mappings are possible.

For a much more thorough dive into identity mapping configurations, please consult the Globus Connect Server’s Identity Mapping documentation.

Starting the Multi-User Endpoint

A multi-user endpoint requires a privileged local user account (e.g., root) to start, enabling the manager endpoint process to perform identity mapping and drop privileges to mapped user accounts. Apart from this initial setup requirement, multi-user endpoints operate identically to regular endpoints for starting and stopping:

# gce start my_mu_ep
      >>> Endpoint ID: [endpoint_uuid] <<<
----> Wed Aug  6 20:03:02 2025

Each user endpoint process runs as the mapped local user, ensuring secure isolation of execution environments:

Multi-user endpoint process hierarchy
Manager Endpoint Process (root)
├── User Endpoint Process (alice, UID: 1001)
├── User Endpoint Process (bob, UID: 1002)
└── User Endpoint Process (eve, UID: 1003)

Warning

When the endpoint runs for the first time, it registers with the Compute API, receiving an identifier — the [endpoint_uuid] in the above console output.

This endpoint identifier is then locked to the identity that registered it: going forward, only that same identity may run the endpoint.

If the intention is to run this endpoint with service account credentials, be sure to export those credentials at first run:

# GLOBUS_COMPUTE_CLIENT_ID=... GLOBUS_COMPUTE_CLIENT_SECRET=... gce start my_mu_ep

Alternatively:

# export GLOBUS_COMPUTE_CLIENT_ID=...
# export GLOBUS_COMPUTE_CLIENT_SECRET=...
# gce start my_mu_ep

Installing as a Service

Run gce enable-on-boot to install a systemd unit file:

$ gce enable-on-boot my_endpoint
Systemd service installed. Run
   sudo systemctl enable globus-compute-endpoint-my_endpoint.service --now
to enable the service and start the endpoint.

Run gce disable-on-boot for commands to disable and uninstall the service:

$ gce disable-on-boot my-endpoint
Run the following to disable on-boot-persistence:
   systemctl stop globus-compute-endpoint-my-endpoint
   systemctl disable globus-compute-endpoint-my-endpoint
   rm /etc/systemd/system/globus-compute-endpoint-my-endpoint.service

Tip

See the warning in the previous section; typically, endpoints run as a service use client credentials (i.e., need the GLOBUS_COMPUTE_CLIENT_ID and GLOBUS_COMPUTE_CLIENT_SECRET environment variables). A reminder of the syntax in Systemd unit files:

# ...

[Service]
Environment="GLOBUS_COMPUTE_CLIENT_ID=<...identifier...>"
Environment="GLOBUS_COMPUTE_CLIENT_SECRET=<...secret...>"
ExecStart=/path/to/gce start ...
# ...

Common Startup Errors

Rounding out the table in § Endpoint Process Startup Errors, there are a couple of well-known errors that administrators may additionally encounter. Given the local access, the root cause of these exit codes should be evident in the endpoint.log or even on the console. For completeness, however, when run as a multi-user endpoint, the following two exit codes are possible:

Possible endpoint process exit codes

Python os constant name   Integer value   Likely Reason
-----------------------   -------------   -------------
os.EX_NOPERM              77              Missing required permissions to read the identity mapping configuration file.
os.EX_CONFIG              78              Unknown problem reading or parsing the identity mapping configuration file.
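
For administrators who wrap the endpoint in their own tooling, the two exit codes can be translated back into the reasons listed above. This is a small sketch; the helper name explain_exit_code and the fallback message are illustrative and not part of the endpoint software.

```python
import os

# Exit codes specific to multi-user endpoints (values as defined on Linux)
STARTUP_ERRORS = {
    os.EX_NOPERM: "Missing required permissions to read the identity mapping configuration file.",
    os.EX_CONFIG: "Unknown problem reading or parsing the identity mapping configuration file.",
}


def explain_exit_code(code):
    """Return the likely reason for a multi-user-specific exit code, if known."""
    return STARTUP_ERRORS.get(code, "See endpoint.log for details.")


print(explain_exit_code(os.EX_NOPERM))
```

A wrapper script could pass the return code of the endpoint process to this helper before deciding whether to retry or alert.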

Pluggable Authentication Modules (PAM)

Pluggable Authentication Modules (PAM) allows administrators to configure site-specific authentication schemes with arbitrary requirements. For example, where one site might require users to use MFA, another site could disallow use of the system for some users at certain times of the day. Rather than rewrite or modify software to accommodate each site’s needs, administrators can simply change their site configuration.

As a brief intro to PAM, the architecture is designed with four phases:

  • authentication

  • account management

  • session management

  • password management

The multi-user endpoint implements account and session management. If enabled, then the child process will create a PAM session, check the account (pam_acct_mgmt(3)), and then open a session (pam_open_session(3)). If these two steps succeed, then the manager endpoint process will continue to drop privileges. It is in these two steps that the administrator can implement custom configuration.

PAM is configured in two parts. For the config.yaml file, use the pam field:

config.yaml to show PAM
identity_mapping_config_path: .../some/idmap.json
pam:
  enable: true

This configuration will choose the default PAM service name, globus-compute-endpoint (see PamConfiguration). The service name is the name of the PAM configuration file in /etc/pam.d/. Use service_name to tell the endpoint to authorize users against a different PAM configuration:

config.yaml with a custom PAM service name
identity_mapping_config_path: .../some/idmap.json
pam:
  enable: true

  # the PAM routines will look for `/etc/pam.d/gce-ep123-specific-requirements`
  service_name: gce-ep123-specific-requirements

For clarity, note that the service name is simply passed to pam_start(3), to tell PAM which service configuration to apply.

Important

If PAM is not enabled, then before starting user endpoint processes, the child process drops all capabilities and sets the no-new-privileges flag with the kernel. (See prctl(2), in particular PR_SET_NO_NEW_PRIVS.) In particular, this will preclude use of SETUID executables, which can break some schedulers. If your site requires use of SETUID executables, then PAM must be enabled.

Though configuring PAM itself is outside the scope of this document (e.g., see The Linux-PAM System Administrators’ Guide), we briefly discuss a couple of modules to share a taste of what PAM can do. For example, if the administrator were to implement a configuration of:

/etc/pam.d/globus-compute-endpoint
account   requisite     pam_shells.so
session   required      pam_limits.so

then, per pam_shells(8), any user endpoint process for a user whose shell is not listed in /etc/shells will not start and the logs will have a line like:

... (error code: 7 [PAM_AUTH_ERR]) Authentication failure

On the other end, the user’s SDK would receive a message like:

Request payload failed validation: Unable to start user endpoint process for jessica [exit code: 71; (PermissionError) see your system administrator]

Similarly, for users who are administratively allowed (i.e., have a valid shell), the pam_limits(8) module will install the admin-configured process limits.

Hint

The Globus Compute Endpoint software implements the account management and session phases of PAM. As authentication is enacted via Globus Auth and Identity Mapping, it does not use PAM’s authentication (pam_authenticate(3)) phase, nor does it attempt to manage the user’s password. Functionally, this means that only PAM configuration lines that begin with account and session will be utilized.

Look to PAM for a number of other tasks, which are similarly out of scope of this documentation.

(If the available PAM modules do not fit the bill, it is also possible to write a custom module! But sadly, that is also out of scope of this documentation; please see The Linux-PAM Module Writers’ Guide.)

Authentication Policies

Administrators can use a Globus authentication policy to limit access to a multi-user endpoint by enforcing that the user has appropriate identities linked to their Globus account and that the required identities have recent authentications.

Please refer to Authentication Policies for more information.

With Globus OIDC

Administrators can create a custom OIDC server by following the Globus OIDC guide. (Note that this requires an existing Globus Connect Server endpoint.) This OIDC server can then be combined with Globus Auth Policies to authenticate users on a multi-user endpoint.

With a Globus OIDC server configured, run globus-connect-server oidc show to retrieve the OIDC server’s configured domain:

$ globus-connect-server oidc show
Current OIDC server configuration:
{
   "auth_client": {
      "client_id": "00000000-1111-2222-3333-444444444444",
      "domain": "<some custom OIDC domain>",
      "env": "production"
   },
   "clients": {
      "00000000-1111-2222-3333-444444444444": {
         "client_salt": "NOT_ACTUALLY_USED",
         "redirect_uris": [
            [
               "https://auth.globus.org/p/authenticate/callback",
               null
            ]
         ]
      }
   },
   "oidc_server": {
      "display_name": "My Globus OIDC Server",
      "pam_service": "login",
      "support_contact": "Alice Administrator",
      "support_email": "alice@example.org"
   }
}

Note

The value of the domain field will be referenced in the following steps as <OIDC-domain>.

When configuring a new multi-user endpoint, use the --allowed-domains option to restrict access to users authenticated via the OIDC server:

$ gce configure --allowed-domains "<OIDC-domain>" my_oidc_compute_endpoint

To apply the same restriction to an existing multi-user endpoint, create an authentication policy using either the Globus Auth API or the Globus SDK, with domain_constraints_include set to something like [<OIDC-domain>]. Then, add that policy to the endpoint config.

After configuring the endpoint to authenticate against the OIDC server, start or restart the endpoint to ensure the administrator running the endpoint is also properly authenticated against the same OIDC server.

Finally, create an identity mapping configuration so OIDC-authenticated users can run tasks on the endpoint. The following config maps identities of the form user@<OIDC-domain> to the local username user (which must exist on the endpoint host system):

identity_mapping.json
[
   {
      "comment": "Map OIDC identities to local usernames",
      "DATA_TYPE": "expression_identity_mapping#1.0.0",
      "mappings": [
         {
            "source": "{username}",
            "match": "(.*)@<OIDC-domain>",
            "output": "{0}"
         }
      ]
   }
]

Save this configuration to a file (e.g., identity_mapping.json) and reference it in the endpoint’s config.yaml under the identity_mapping_config_path key:

config.yaml
identity_mapping_config_path: /path/to/identity_mapping.json

Administrator Quickstart

  1. Install the Globus Compute Agent package

  2. Quickly verify that installation succeeded and the shell environment points to the correct path:

    # command -v globus-compute-endpoint
    /usr/sbin/globus-compute-endpoint
    
  3. Create a Multi-User Endpoint configuration by running the configure subcommand as a privileged user (e.g., root):

    # globus-compute-endpoint configure prod_gpu_large
    Created multi-user profile for endpoint named <prod_gpu_large>
    
        Configuration file: /root/.globus_compute/prod_gpu_large/config.yaml
    
        Example identity mapping configuration: /root/.globus_compute/prod_gpu_large/example_identity_mapping_config.json
    
        User endpoint configuration template: /root/.globus_compute/prod_gpu_large/user_config_template.yaml.j2
        User endpoint configuration schema: /root/.globus_compute/prod_gpu_large/user_config_schema.json
        User endpoint environment variables: /root/.globus_compute/prod_gpu_large/user_environment.yaml
    
    Use the `start` subcommand to run it:
    
        $ globus-compute-endpoint start prod_gpu_large
    
  4. Set up the identity mapping configuration — this depends on your site’s specific requirements and may take some trial and error. The key point is to be able to take a Globus Auth Identity set, and map it to a local username on this resource — this resulting username will be passed to getpwnam(3) to ascertain a UID for the user. This file is linked in config.yaml (from the previous step’s output), and, per initial configuration, is set to example_identity_mapping_config.json. While the configuration is syntactically valid, it references example.com so will not work until modified. Please refer to the Globus Connect Server Identity Mapping Guide for help updating this file.

  5. Modify user_config_template.yaml.j2 as appropriate for the resources to make available. This file will be interpreted as a Jinja template and will be rendered with user-provided variables to generate the final user endpoint process configuration. The default configuration (as created in step 3) has a basic working configuration, but uses the LocalProvider.

    Please look to Example Configurations as a starting point.

  6. Optionally modify user_config_schema.json; the file, if it exists, defines the JSON schema against which user-provided variables are validated. Writing JSON schemas is out of scope for this documentation, but we do specifically recognize additionalProperties: true which makes the default schema very permissive: any key not specifically specified in the schema is treated as valid.

  7. Modify user_environment.yaml for any environment variables that should be injected into the user endpoint process space:

    SOME_SITE_SPECIFIC_ENV_VAR: a site specific value
    PATH: /site/specific:/path:/opt:/usr:/some/other/path
    
  8. Run the multi-user endpoint manually for testing and easier debugging, as well as to collect the endpoint ID for sharing with users. The first time through, the endpoint will initiate a Globus Auth login flow, and present a long URL:

    # globus-compute-endpoint start prod_gpu_large
    > Endpoint Manager initialization
    Please authenticate with Globus here:
    ------------------------------------
    https://auth.globus.org/v2/oauth2/authorize?clie...&prompt=login
    ------------------------------------
    
    Enter the resulting Authorization Code here: <PASTE CODE HERE AND PRESS ENTER>
    
  9. While iterating, the --log-to-console flag may be useful to emit the log lines to the console (also available at .globus_compute/prod_gpu_large/endpoint.log).

    # globus-compute-endpoint start prod_gpu_large --log-to-console
    >
    
    ========== Endpoint Manager begins: 1ed568ab-79ec-4f7c-be78-a704439b2266
            >>> Multi-User Endpoint ID: 1ed568ab-79ec-4f7c-be78-a704439b2266 <<<
    

    Additionally, for even noisier output, there is --debug.

  10. When ready to install as an on-boot service, install it with a systemd unit file:

    # globus-compute-endpoint enable-on-boot prod_gpu_large
    Systemd service installed at /etc/systemd/system/globus-compute-endpoint-prod_gpu_large.service. Run
        sudo systemctl enable globus-compute-endpoint-prod_gpu_large --now
    to enable the service and start the endpoint.
    

    And enable via the usual interaction:

    # systemctl enable globus-compute-endpoint-prod_gpu_large --now
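
Regarding step 6, the effect of additionalProperties on user-provided variables can be seen with a hand-rolled check of just two JSON Schema keywords. This is a sketch under stated assumptions: the schema contents and the function name check_user_variables are hypothetical, and the endpoint relies on a complete JSON Schema validator rather than anything like this.

```python
def check_user_variables(user_vars, schema):
    """Check `required` and `additionalProperties` only; not a full validator."""
    errors = []
    for name in schema.get("required", []):
        if name not in user_vars:
            errors.append(f"missing required variable: {name}")
    # JSON Schema treats an absent additionalProperties as true (permissive)
    if schema.get("additionalProperties", True) is False:
        known = schema.get("properties", {})
        for name in user_vars:
            if name not in known:
                errors.append(f"unexpected variable: {name}")
    return errors


# Hypothetical schema for the Polaris template shown earlier in this guide
strict_schema = {
    "type": "object",
    "properties": {
        "ACCOUNT_ID": {"type": "string"},
        "NODES_PER_BLOCK": {"type": "integer"},
    },
    "required": ["ACCOUNT_ID"],
    "additionalProperties": False,
}
permissive_schema = dict(strict_schema, additionalProperties=True)

user_vars = {"ACCOUNT_ID": "user_allocation_account_id", "QUEUE": "prod"}
print(check_user_variables(user_vars, strict_schema))
print(check_user_variables(user_vars, permissive_schema))
```

Setting additionalProperties to false is the stricter choice: a misspelled variable name is rejected up front rather than silently ignored by the template.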