Overview
Guild supports training on remote systems by way of its remote facility. To use a remote:
- Define a remote in user configuration
- Specify the remote name using the --remote option when running an operation
For a complete reference on remote configuration, see Remotes Reference.
Define a Remote
Remotes are defined in user configuration. Below is an example of an SSH remote named remote-gpu:
remotes:
  remote-gpu:
    type: ssh
    host: gpu001.mydomain.com
    user: ubuntu
    private-key: ~/.ssh/gpu001.pem
Guild supports the following remote types:
ssh | Connect to a remote server over SSH. Use this type to train on remote servers on-premises or on any cloud vendor. Guild does not support starting ssh remotes. |
ec2 | Connect to a remote EC2 host over SSH. This remote type supports the remote start and stop commands, given EC2-specific configuration for the remote. |
s3 | Copy runs to and from S3. This remote type does not support runs but can be used for backup and restore. |
azure-vm | Connect to a remote Azure host over SSH. |
azure-blob | Copy runs to and from Azure blob storage. This remote type does not support runs but can be used for backup and restore. |
gist | Copy runs to and from GitHub gists. This remote type does not support runs but can be used for backup and restore. |
For a complete list of remote types, including examples, see Remotes Reference.
Manage Remotes
Remotes can be listed, checked for status, and, if supported by the remote type, started and stopped.
Remote management commands:
guild remotes | List available remotes. |
guild remote status | Show status for a remote. |
guild remote start | Start a remote. Not all remote types can be started. |
guild remote stop | Stop a remote. Not all remote types can be stopped. |
A remote must be available before it can be used in a remote command. Check a remote using guild remote status. If a remote is not available and can be started, use guild remote start to start it first. Note that some remote types cannot be started or stopped. Refer to Remotes Reference for details on each remote type.
Remote Commands
To apply a command to a remote, use the --remote option. For example, to run guild check on a remote named remote-gpu (see example above), run:
guild check --remote remote-gpu
Not all remote types support every command. For example, the s3 remote type does not support the run command. Refer to Remotes Reference for details on which remote commands are supported for a particular remote type.
Guild commands that support remotes:
check | Check Guild on the remote |
run | Run an operation on a remote |
stop | Stop runs in progress on a remote |
watch | Connect to a remote run in progress and watch its output |
runs | List runs on a remote |
runs info | Show information about a remote run |
ls | List remote run files |
diff | Diff remote runs |
cat | Show remote run file or output |
label | Apply a label to one or more remote runs |
runs delete | Delete remote runs |
runs restore | Restore deleted remote runs on a remote |
runs purge | Purge deleted remote runs on a remote |
pull | Copy remote runs to the local environment |
push | Copy local runs to the remote |
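Several of the commands listed above can be chained into a simple remote workflow. The sketch below is illustrative only: run_on_remote is a hypothetical helper name, the operation name passed to it is a placeholder, and it assumes that pull takes the remote name as a positional argument rather than through --remote (worth confirming with guild pull --help).

```shell
# Hypothetical helper (not part of Guild): run an operation on a remote
# and copy the resulting runs back. Stops at the first failing step.
run_on_remote() {
    remote="$1"
    operation="$2"
    # Verify the Guild installation on the remote.
    guild check --remote "$remote" || return 1
    # Run the operation on the remote.
    guild run "$operation" --remote "$remote" || return 1
    # List runs on the remote.
    guild runs --remote "$remote" || return 1
    # Copy remote runs to the local environment.
    # Assumption: pull takes the remote name as a positional argument.
    guild pull "$remote"
}
```

For example, run_on_remote remote-gpu train would use the remote defined earlier, where train stands in for an operation defined in your project. Guild may prompt for confirmation at some of these steps.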