Usage
guild run [OPTIONS] [[MODEL:]OPERATION] [FLAG=VAL...]
Run an operation.
By default, Guild tries to run OPERATION for the default model defined in the current project.
If MODEL is specified, Guild uses it instead of the default model.
OPERATION may alternatively be a Python script. In this case any current project is ignored and the script is run directly. Options in the format --NAME=VAL can be passed to the script using flags (see below).
[MODEL:]OPERATION may be omitted if --restart or --proto is specified, in which case the operation used in RUN is used.
Specify FLAG values in the form FLAG=VAL.
Batch Files
One or more batch files can be used to run multiple trials by specifying the file path as @PATH.
For example, to run trials specified in a CSV file named trials.csv, run:
guild run [MODEL:]OPERATION @trials.csv
NOTE: At this time you must specify the operation when using batch files. Batch files contain only flag values and cannot be used to run different operations in the same command.
Batch files may be formatted as CSV, JSON, or YAML. Format is determined by the file extension.
Each entry in the file is used as a set of flags for a trial run.
CSV files must have a header row containing the flag names. Each subsequent row is a corresponding list of flag values that Guild uses for a generated trial.
JSON and YAML files must contain a top-level list of flag-to-value maps.
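As an illustration of the formats described above, the following Python sketch writes the same two trials to a CSV file and to a JSON file using only the standard library. The flag names and values (learning-rate, batch-size) and the output file names are hypothetical examples and do not come from Guild.

```python
import csv
import json
import os
import tempfile

# Hypothetical trials: each dict is one set of flag values.
trials = [
    {"learning-rate": 0.01, "batch-size": 10},
    {"learning-rate": 0.1, "batch-size": 100},
]

out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "trials.csv")
json_path = os.path.join(out_dir, "trials.json")

# CSV: a header row of flag names, then one row of values per trial.
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(trials[0]))
    writer.writeheader()
    writer.writerows(trials)

# JSON: a top-level list of flag-to-value maps.
with open(json_path, "w") as f:
    json.dump(trials, f, indent=2)

with open(csv_path) as f:
    print(f.read())
```

Either file could then be passed to the command using the @PATH syntax, for example @trials.csv.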
Use --print-trials to preview the trials that would be run for the specified batch files.
Flag Lists
A list of flag values may be specified using the syntax [VAL1[,VAL2]...]. Lists containing white space must be quoted. When a list of values is provided, Guild generates a trial run for each value. When multiple flags have list values, Guild generates the Cartesian product of all possible flag combinations.
Flag lists may be used to perform grid search operations.
For example, the following generates four runs for operation train and flags learning-rate and batch-size:
guild run train learning-rate=[0.01,0.1] batch-size=[10,100]
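To see why two flags with two values each yield four runs, the following Python sketch expands flag lists into their Cartesian product. It is a minimal illustration of the idea, not Guild's implementation; the function name expand_trials is hypothetical.

```python
from itertools import product

# Flag lists from the example command above.
flag_lists = {
    "learning-rate": [0.01, 0.1],
    "batch-size": [10, 100],
}

def expand_trials(flag_lists):
    """Return one flag dict per combination of list values."""
    names = list(flag_lists)
    return [dict(zip(names, combo)) for combo in product(*flag_lists.values())]

trials = expand_trials(flag_lists)
for trial in trials:
    print(trial)
```

The loop prints four flag combinations, one per generated trial.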
You can preview the trials generated from flag lists using --print-trials. You can save the generated trials to a batch file using --save-trials. For more information, see Preview or Save Trials below.
When --optimizer is specified, flag lists may take on a different meaning depending on the type of optimizer. For example, the random optimizer randomly selects values from a flag list rather than generating a trial for each value. See Optimizers below for more information.
Optimizers
A run may be optimized using --optimizer. An optimizer runs up to --max-trials runs using flag values and flag configuration.
For details on available optimizers and their behavior, refer to the Optimizers Reference.
Limit Trials
When using flag lists or optimizers, which generate trials, you can limit the number of trials with --max-trials. By default, Guild limits the number of generated trials to 20.
Guild limits trials by randomly sampling the maximum number from the total list of generated trials. You can specify the seed used for the random sample with --random-seed. The random seed is guaranteed to generate consistent results when used on the same version of Python. When used across different versions of Python, the results may be inconsistent.
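The effect of a trial limit combined with a fixed seed can be sketched in Python as follows. This is an assumption-laden illustration, not Guild's sampling code: the function limit_trials and its parameters are hypothetical, and the real sampling order may differ.

```python
import random

def limit_trials(trials, max_trials=20, random_seed=None):
    """Return at most max_trials trials, sampled without replacement."""
    if len(trials) <= max_trials:
        return list(trials)
    # A dedicated generator keeps the sample reproducible for a given seed.
    rng = random.Random(random_seed)
    return rng.sample(trials, max_trials)

# Hypothetical set of 100 generated trials.
all_trials = [{"x": i} for i in range(100)]

first = limit_trials(all_trials, max_trials=5, random_seed=42)
second = limit_trials(all_trials, max_trials=5, random_seed=42)
print(first == second)
```

With the same seed and the same Python version, both calls select the same five trials.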
Preview or Save Trials
When flag lists (used for grid search) or an optimizer is used, you can preview the generated trials using --print-trials. You can save the generated trials as a CSV batch file using --save-trials.
Start an Operation Using a Prototype Run
If --proto is specified, Guild applies the operation, flags, and source code used in RUN to the new operation. You may add or redefine flags in the new operation. You may use an alternative operation, in which case only the flag values and source code from RUN are applied. RUN must be a run ID or unique run ID prefix.
Restart an Operation
If --restart is specified, RUN is restarted using its operation and flags. Unlike --proto, restart does not create a new run. You cannot change the operation, flags, source code, or run directory when restarting a run.
Staging an Operation
Use --stage to stage an operation to be run later. Use --start with the staged run ID to start it.
If --start is specified, RUN is started using the same rules applied to --restart (see above).
Alternate Run Directory
To run an operation outside of Guild’s run management facility, use --run-dir or --stage-dir to specify an alternative run directory. These options are useful when developing or debugging an operation. Use --stage-dir to prepare a run directory for an operation without running the operation itself. This is useful when you want to verify run directory layout or manually run an operation in a prepared directory.
NOTE: Runs started with --run-dir are not visible to Guild and do not appear in run listings.
Control Visible GPUs
By default, operations have access to all available GPU devices. To limit the GPU devices available to a run, use --gpus.
For example, to limit visible GPU devices to 0 and 1, run:
guild run --gpus 0,1 ...
To disable all available GPUs, use --no-gpus.
NOTE: --gpus and --no-gpus are used to construct the CUDA_VISIBLE_DEVICES environment variable used for the run process. If CUDA_VISIBLE_DEVICES is set, using either of these options redefines that environment variable for the run.
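The way these two options could map onto CUDA_VISIBLE_DEVICES is sketched below in Python. This is not Guild's code: the function run_env is hypothetical, and the use of an empty value to hide all devices for --no-gpus is an assumption based on common CUDA behavior.

```python
def run_env(base_env, gpus=None, no_gpus=False):
    """Return a copy of base_env with CUDA_VISIBLE_DEVICES adjusted."""
    if gpus is not None and no_gpus:
        raise ValueError("--gpus and --no-gpus cannot be used together")
    env = dict(base_env)
    if no_gpus:
        # Assumption: an empty value hides every GPU from the process.
        env["CUDA_VISIBLE_DEVICES"] = ""
    elif gpus is not None:
        # Overrides any value inherited from the calling environment.
        env["CUDA_VISIBLE_DEVICES"] = gpus
    return env

inherited = {"CUDA_VISIBLE_DEVICES": "3"}
print(run_env(inherited, gpus="0,1")["CUDA_VISIBLE_DEVICES"])
```

When neither option is given, the sketch leaves any inherited value untouched.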
Optimize Runs
Use --optimizer to run the operation multiple times in an attempt to optimize a result. Use --minimize or --maximize to indicate what should be optimized. Use --max-trials to indicate the maximum number of runs the optimizer should generate.
Edit Flags
Use --edit-flags to use an editor to review and modify flag values. Guild uses the editor defined in the VISUAL or EDITOR environment variables. If neither environment variable is defined, Guild uses an editor suitable for the current platform.
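A lookup of this kind can be sketched in Python as follows. The precedence of VISUAL over EDITOR is an assumption (it is the common convention, but the text above does not state an order), and both the function resolve_editor and the fallback value are hypothetical.

```python
def resolve_editor(env, platform_default="vi"):
    """Pick an editor command from an environment mapping."""
    # Assumption: VISUAL takes precedence over EDITOR.
    # platform_default is a hypothetical stand-in for a platform-specific choice.
    return env.get("VISUAL") or env.get("EDITOR") or platform_default

print(resolve_editor({"EDITOR": "nano"}))
print(resolve_editor({}))
```

In practice the mapping passed in would typically be os.environ.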
Debug Source Code
Use --debug-sourcecode to specify the location of project source code for debugging. Guild uses this path instead of the location of the copied source code for the run. For example, when debugging project files, use this option to ensure that modules are loaded from the project location rather than the run directory.
Breakpoints
Use --break to set breakpoints for Python-based operations. LOCATION may be specified as [FILENAME:]LINE or as MODULE.FUNCTION.
If FILENAME is not specified, the main module is assumed. Use the value 1 to break at the start of the main module (line 1).
Relative file names are resolved relative to their location in the Python system path. You can omit the .py extension.
If a line number does not correspond to a valid breakpoint, Guild attempts to set a breakpoint on the next valid breakpoint line in the applicable module.
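The two LOCATION forms can be told apart with a small parser such as the Python sketch below. It only illustrates the syntax described above and is not Guild's parser; the function parse_break_location and the returned dictionary layout are hypothetical.

```python
def parse_break_location(location):
    """Classify LOCATION as [FILENAME:]LINE or MODULE.FUNCTION."""
    filename, sep, last = location.rpartition(":")
    if last.isdigit():
        # [FILENAME:]LINE; a missing file name means the main module.
        return {"kind": "line", "filename": filename or None, "line": int(last)}
    if sep:
        raise ValueError(f"invalid line number in {location!r}")
    module, dot, function = location.rpartition(".")
    if dot and module and function:
        return {"kind": "function", "module": module, "function": function}
    raise ValueError(f"unrecognized break location {location!r}")

print(parse_break_location("1"))
print(parse_break_location("train:25"))
print(parse_break_location("train.main"))
```

The sketch does not resolve file names or adjust line numbers to the next valid breakpoint line; it only separates the two forms.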
Options
-l, --label LABEL | Set a label for the run.
-t, --tag TAG | Associate TAG with run. May be used multiple times.
-c, --comment COMMENT | Comment associated with the run.
-ec, --edit-comment | Use an editor to type a comment.
-e, --edit-flags | Use an editor to review and modify flags.
-d, --run-dir DIR | Use alternative run directory DIR. Cannot be used with --stage.
--stage | Stage an operation.
--start, --restart RUN | Start a staged run or restart an existing run. Cannot be used with --proto or --run-dir.
--proto RUN | Use the operation, flags, and source code from RUN. Flags may be added or redefined in this operation. Cannot be used with --restart.
--force-sourcecode | Use working source code when --restart or --proto is specified. Ignored otherwise.
--gpus DEVICES | Limit available GPUs to DEVICES, a comma-separated list of device IDs. By default all GPUs are available. Cannot be used with --no-gpus.
--no-gpus | Disable GPUs for run. Cannot be used with --gpus.
-bl, --batch-label LABEL | Label to use for batch runs. Ignored for non-batch runs.
-bt, --batch-tag TAG | Associate TAG with batch. Ignored for non-batch runs. May be used multiple times.
-bc, --batch-comment COMMENT | Comment associated with batch.
-ebc, --edit-batch-comment | Use an editor to type a batch comment.
-o, --optimizer ALGORITHM | Optimize the run using the specified algorithm. See Optimize Runs above for more information.
-O, --optimize | Optimize the run using the default optimizer.
-N, --minimize COLUMN | Column to minimize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with --maximize.
-X, --maximize COLUMN | Column to maximize when running with an optimizer. See help for the compare command for details on specifying a column. May not be used with --minimize.
-Fo, --opt-flag FLAG=VAL | Flag for OPTIMIZER. May be used multiple times.
-m, --max-trials, --trials N | Maximum number of trials to run in batch operations. Default is optimizer specific. If optimizer is not specified, default is 20.
--random-seed N | Random seed used when sampling trials or flag values.
--debug-sourcecode PATH | Specify an alternative source code path for debugging. See Debug Source Code above for details.
--stage-trials | For batch operations, stage trials without running them.
-r, --remote REMOTE | Run the operation remotely.
-y, --yes | Do not prompt before running operation.
-f, --force-flags | Accept all flag assignments, even for undefined or invalid values.
--force-deps | Continue even when a required resource is not resolved.
--stop-after N | Stop operation after N minutes.
--fail-on-trial-error | Stop batch operations when a trial exits with an error.
--needed | Run only if there is not an available matching run. A matching run is of the same operation with the same flag values that is not stopped due to an error.
-b, --background | Run operation in background.
--pidfile PIDFILE | Run operation in background, writing the background process ID to PIDFILE.
-n, --no-wait | Don’t wait for a remote operation to complete. Ignored if run is local.
--save-trials PATH | Saves generated trials to a CSV batch file. See Batch Files above for more information.
--keep-run | Keep run even when configured with ‘delete-on-success’.
--keep-batch | Keep batch run rather than delete it on success.
-D, --dep PATH | Include PATH as a dependency.
--break LOCATION | Set a breakpoint at the specified location for Python-based operations. Set LOCATION to 1 to break at line 1 of the main module. See Breakpoints above for LOCATION format. Use multiple times for more than one breakpoint.
--break-on-error | Enter the Python debugger at the point an error occurs for Python-based operations.
-q, --quiet | Do not show output.
--print-cmd | Show operation command and exit.
--print-env | Show operation environment and exit.
--print-trials | Show generated trials and exit.
--help-model | Show model help and exit.
-h, --help-op | Show operation help and exit.
--test-output-scalars OUTPUT | Test output scalars on OUTPUT. Use ‘-’ to read from standard input.
--test-sourcecode | Test source code selection.
--test-flags | Test flag configuration.
--help | Show this message and exit.
Guild AI version 0.9.0